Thursday 30 October 2014

Some thoughts on microservices

One of the themes that recurred throughout many of the talks at JAX London this year was microservices. Now, I'll be the first to admit that I really don't like the term microservices because it's very ambiguous: what's a service? And why is it "micro"? In comparison to what exactly? What does it do? However, regardless of arguments over the nomenclature, it's clear that service-oriented architectures are widely discussed. One may even say they're a bit of a buzzword right now.

If your organisation is structured similarly to Spotify's matrix (and we find it works well), then there will be few cross-team dependencies. Each team will be able to get on and build stuff without anything or anybody getting in their way. Some teams will build new features, some will improve various parts of the platform, and others may focus on scaling the architecture. The general gist is that your engineering teams are all going to be building very different parts of the system with little to no overlap. Naturally, your team, when faced with the task of building a new large architectural feature, will jump at the opportunity to stop contributing to the big ball of mud that they usually commit to. "Let's have our own repo!" they cry.

Conway's Law conjectures that the architecture of the software produced in an organisation reflects its communication paths and team structures, and to an extent I think that's true. In an ideal world, each team would like to have their own codebase and process. This pushes the inter-team complexity down and keeps the need to look at "that" code to a minimum. The team may want to control their release cycle, to do their own continuous deployment, and generally feel like they have complete ownership over what they are producing and when they are producing it. This is great for team morale. A new feature can be built as a standalone application. Some months down the line, another team starts building another new feature that would get out of the door really quickly if they were able to reuse the work that the previous team did, but neither team wants to work in the other's repository; it just feels wrong. How can an interface be provided for other internal teams to work with? Thinking about features as services can act as guidance here.

You may find yourself faced with a monolithic codebase already, and you want to reuse a part of it from another application in your architecture. This would be a good opportunity to pull that code out into a separate repository and run it as a standalone application, as you can decrease the complexity of the monolithic code at the same time. Just decide on an interface for the other parts of your architecture that are calling it, and away you go. It may even be a good opportunity to rewrite that code from scratch if it's sufficiently small, knowing what you do now. Why not open source it while you're at it?

Dealing with scale can naturally steer you towards a service-oriented architecture. Perhaps there is a part of your data collection process that is becoming a bottleneck. It could be split out into a service. This brings additional benefits such as allowing multiple instances to be run for load balancing purposes, allowing you to horizontally scale that part of the code without horizontally scaling the rest of it at the same time.

There's no right or wrong answer here, and there's no silver bullet for all situations. It may help to categorise what sort of contract your service will have with the rest of the system. Here are some examples.

RESTful services

If your service provides timely responses to requests, then a REST API may be a good approach. For example, part of your architecture may compute a set of recommended products based on a given product. A REST API also has the added benefit of leaving the door open to external access in the future, either for free or for a price. Spring Boot allows you to get webapps up and running extremely quickly, and I'd recommend looking at it for a new project. The Spring project also has examples of how to write REST consumer applications so that your services can talk to each other with minimal effort.
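To make the recommendations example a little more concrete, here is a minimal sketch of such a service. It deliberately uses the HTTP server that ships with the JDK rather than Spring Boot, purely so that it runs without any dependencies. The class name, the /recommendations/ path, the port and the product data are all made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical recommendations service built only on JDK classes.
class RecommendationService {

    // Stand-in for a real recommender: a fixed lookup table of made-up data.
    static final Map<String, List<String>> RELATED = Map.of(
            "kettle", List.of("teapot", "mug"),
            "laptop", List.of("laptop-sleeve", "mouse"));

    // Builds the JSON body for one product; unknown products get an empty list.
    // Simplification: the product id is not escaped, so this is not production-safe.
    static String recommendationsJson(String productId) {
        String items = RELATED.getOrDefault(productId, List.of()).stream()
                .map(id -> "\"" + id + "\"")
                .collect(Collectors.joining(","));
        return "{\"product\":\"" + productId + "\",\"recommended\":[" + items + "]}";
    }

    // Starts the HTTP server; passing port 0 asks the OS for any free port.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/recommendations/", exchange -> {
            // Everything after the context path is treated as the product id.
            String path = exchange.getRequestURI().getPath();
            String productId = path.substring("/recommendations/".length());
            byte[] body = recommendationsJson(productId).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Try http://localhost:" + server.getAddress().getPort()
                + "/recommendations/kettle");
    }
}
```

The point is the shape of the contract rather than the implementation: one resource per product, a quick response, and a JSON body that another team (or, one day, an external customer) can consume without seeing your code.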

Pipeline services

You may be splitting up a data collection pipeline because a certain part is a bottleneck. Since this area of the system will never have an external facing API, using a message broker to pass intermediate data is a good idea. We've been using Apache Kafka for this at Brandwatch with great success. You can decide how to distribute the load between your various instances with a lot of flexibility. I gave a talk on how we're using leader election to do that in our event-detection pipeline.
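As a rough illustration of the idea (and not of how we actually do it with Kafka), the sketch below uses an in-memory BlockingQueue as a stand-in for the broker and a few worker threads as stand-ins for service instances. The class name, the stop marker and the "normalise an event" step are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-memory stand-in for a broker-backed pipeline stage (assumption: no real Kafka here).
class PipelineSketch {

    // Marker that tells a worker there is nothing more to consume.
    static final String STOP = "__stop__";

    // Pushes raw events through `workers` competing consumers and returns the
    // processed events, sorted because the processing order is not deterministic.
    static List<String> run(List<String> rawEvents, int workers) {
        BlockingQueue<String> topic = new LinkedBlockingQueue<>();
        Queue<String> processed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    // Each worker takes whatever is next, so load spreads itself out.
                    for (String e = topic.take(); !e.equals(STOP); e = topic.take()) {
                        processed.add(e.trim().toLowerCase(Locale.ROOT));
                    }
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        topic.addAll(rawEvents);
        for (int i = 0; i < workers; i++) {
            topic.add(STOP); // one stop marker per worker
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }

        List<String> result = new ArrayList<>(processed);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(" Foo", "BAR ", "baz"), 2));
    }
}
```

With a real broker the queue lives outside your process, which is what lets you add or remove instances of the slow stage without touching the stages on either side of it.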

Slow services

Some services can take a long time to provide a final response. For example, you may be firing off a batch job like updating a large portion of a search index, or performing a large MapReduce task. A REST API could work well here, with the request immediately returning the location where the output will eventually be stored. The caller can then poll that location until the output appears, and this kind of waiting can be delegated to a background thread in your application.
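Here is a minimal sketch of that submit-then-poll contract. To keep it runnable on its own, an in-memory map plays the part of the place where results are stored and a background thread plays the part of the slow job; the class name, the /results/ locations and the "done:" payload are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of a slow service: the caller gets a result location straight away and polls it.
class SlowJobSketch {

    // Stand-in for wherever results are really stored (a file store, a database, ...).
    static final Map<String, String> STORE = new ConcurrentHashMap<>();

    // Single daemon thread standing in for the batch job runner.
    static final ExecutorService WORKER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "slow-job");
        t.setDaemon(true);
        return t;
    });

    // Starts the job and returns, ahead of time, where its output will appear.
    static String submit(String jobId, long workMillis) {
        String location = "/results/" + jobId;
        WORKER.execute(() -> {
            pause(workMillis); // pretend to rebuild an index
            STORE.put(location, "done:" + jobId);
        });
        return location;
    }

    // Polls the location until the output appears or the timeout passes.
    static Optional<String> poll(String location, long intervalMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline && !Thread.currentThread().isInterrupted()) {
            String result = STORE.get(location);
            if (result != null) {
                return Optional.of(result);
            }
            pause(intervalMillis);
        }
        return Optional.ofNullable(STORE.get(location));
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        String location = submit("reindex-1", 200);
        System.out.println("Output expected at " + location);
        System.out.println(poll(location, 50, 5000));
    }
}
```

In a real system submit would be the REST call and poll would be a GET against the returned location, but the contract between caller and service is the same.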

It's worth bearing in mind that while splitting your architecture into smaller services can reduce the complexity of the code and make it easier to understand various parts of the system in isolation, it pushes out the complexity into managing how the services communicate with each other. If you change the REST responses of a service, or alter the Java class that you're serialising to send down the wire, which other parts of the system will it break? It's hard to know at compile time. A team might not know that they are subtly breaking the interface with another team's service. Communication, monitoring and regression testing are extremely important here.
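One cheap form of regression testing is for each consuming team to write down the fields it relies on, and for the providing team to check its responses against that list on every build. The sketch below shows the core of such a check; the class name, the field names and the idea of representing a response as a map are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical contract check between a consumer and a provider service.
class ContractCheck {

    // Returns the fields the consumer needs that the provider's response no longer has.
    static Set<String> missingFields(Set<String> requiredByConsumer, Map<String, ?> providerResponse) {
        Set<String> missing = new TreeSet<>(requiredByConsumer);
        missing.removeAll(providerResponse.keySet());
        return missing;
    }

    public static void main(String[] args) {
        Set<String> required = Set.of("productId", "score");

        // Version 1 of the response satisfies the consumer.
        System.out.println(missingFields(required, Map.of("productId", "p1", "score", 0.9)));

        // Version 2 renamed "score" to "rank", which would silently break the consumer.
        System.out.println(missingFields(required, Map.of("productId", "p1", "rank", 1)));
    }
}
```

Running something like this in the provider's build turns "it's hard to know at compile time" into a failing test before the change ever reaches another team.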

So, services. I like them. Maybe you will too, but be careful and apply them gently.

Sunday 26 October 2014

Three years later

I would ordinarily scoff when I looked at a blog and saw that it hadn't been updated for a long period of time. I am now scoffing at myself. The last post that I made was just before I started working at Brandwatch, which was a few months after I finished my Ph.D., which was, more precisely, 3 years and 1 month ago. I've now been working here for longer than I was undertaking my doctoral studies and time has passed very quickly indeed. Work and life are very different now.

There's a lot to be said about teams of engineers tackling big problems. My doctoral years were very solitary in terms of my work. You toil on a deep problem in a niche so narrow that at the end of it all you are the world expert on it. This is an empowering proposition; however, it does mean that nobody truly shares your burden and successes along the way apart from yourself. I thrived in this environment because I can muster a substantial amount of self-determination and, to an extent, stubbornness. This is what got me out of bed in the morning and kept me pushing through until late in the night, day in and day out. It paid off. But it's not sustainable. Research in academia is being overtaken by the pure creative power of industry. I'd make the conjecture that industry is solving the most important and interesting problems in computer science right now, and that's where I want to be.

The true joy of engineering is not sitting in an ivory tower and becoming a specialist on a particularly arcane area that few people know about. The true joy is building things with people, for people. Fundamentally, software engineers are not that different from traditional engineers. When Brunel built bridges or railways, he did so to solve people's problems; to connect people. Software engineering is the same, and connecting people resonates both inside and outside of the workshop.

Inside the workshop, engineers connect with each other to dream, design and build things that help enrich the lives of others. Teams of engineers, when they work together well and have proper guidance, can do incredible things. Working closely with exceptionally smart people has been extremely rewarding. My programming skills have improved vastly. My ability to pass this knowledge on to others has also improved, and I feel the audience has been infinitely more attentive than the students that I taught at the university (intriguingly, those that were most receptive in my seminars now work for the same company). Through pull requests, pair programming and technical talks, I learn something new every day.

Outside of the workshop, engineers are connecting people together. When I read and hear feedback from our users that compliments our work, I feel genuinely happy to have made a difference. Even if the contribution was something small, it doesn't matter - we've made someone's day better: perhaps we fixed a common frustration, maybe we saved them a few minutes each morning, or we've delivered some great new functionality. When hundreds of clients turn up to one of our events to learn more about the platform and products we are building, I feel part of something much greater than myself. When I spoke at JAX London earlier this month, a whole room of people were really listening to what I had to say and I felt that I'd given something useful back to the community. That was the biggest struggle I had with academic work: collaboration seemed fuelled by the need to publish more papers for one's own citation count, rather than by the desire to tackle a problem the world really needed solved.

If you're an engineer and you want to make a difference, then know full well that you can. There's never been a more exciting time to be alive with the skills that you have. Join a start-up and help it grow into a world-class organisation. Join a world-class organisation and make it better. Work for yourself while travelling from place to place with a laptop and a minimal set of clothing - it's all possible. There's so much to build for everyone out there - there's just not enough time to do it all.