Thursday, 30 October 2014

Some thoughts on microservices

One of the themes that recurred through many of the talks at JAX London this year was microservices. Now, I'll be the first to admit that I really don't like the term microservices because it's very ambiguous: what's a service? And why is it "micro"? In comparison to what exactly? What does it do? However, regardless of arguments over the nomenclature, it's clear that service-oriented architectures are widely discussed. One may even say they're a bit of a buzzword right now.

If your organisation is structured similarly to Spotify's matrix (and we find it works well), then there will be few cross-team dependencies. Each team will be able to get on and build stuff without anything or anybody getting in their way. Some teams will build new features, some will improve various parts of the platform, and others may focus on scaling the architecture. The general gist is that your engineering teams are all going to be building very different parts of the system with little to no overlap. Naturally, your team, when faced with the task of building a new large architectural feature, will jump at the opportunity to stop contributing to the big ball of mud that they usually commit to. "Let's have our own repo!" they cry.

Conway's Law conjectures that the architecture of the software that is produced in an organisation reflects the communication paths and team structures, and to an extent I think that's true. In an ideal world, each team would like to have their own codebase and process. This pushes the inter-team complexity down and keeps the need to look at "that" code to a minimum. The team may want to control their release cycle, to do their own continuous deployment, and generally feel like they have complete ownership over what they are producing and when they are producing it. This is great for team morale. A new feature can be built as a standalone application. Some months down the line, another team start building another new feature that would get out of the door really quickly if they were able to reuse the work that the previous team did, but neither of them want to work in each other's repositories; it just feels wrong. How can an interface be provided for other internal teams to work with? Thinking about features as services can act as guidance here.

You may find yourself faced with a monolithic codebase already, and you want to reuse a part of it from another application in your architecture. This would be a good opportunity to pull that code out into a separate repository and run it as a standalone application, as you can decrease the complexity of the monolithic code at the same time. Just decide on an interface for the other parts of your architecture that are calling it, and away you go. Knowing what you do now, it may even be a good opportunity to rewrite that code from scratch if it's sufficiently small. Why not open source it while you're at it?

Dealing with scale can naturally steer you towards a service-oriented architecture. Perhaps there is a part of your data collection process that is becoming a bottleneck. It could be split out into a service. This brings additional benefits such as allowing multiple instances to be run for load balancing purposes, allowing you to horizontally scale that part of the code without horizontally scaling the rest of it at the same time.

There's no right or wrong answer here, and there's no silver bullet for all situations. It may help to categorise what sort of contract your service will have with the rest of the system. Here are some examples.

RESTful services

If your service provides timely responses to requests, then a REST API may be a good approach. For example, part of your architecture may compute a set of recommended products based on a given product. A REST API also leaves the door open to allowing external access in the future, either for free or for a price. Spring Boot allows you to get webapps up and running extremely quickly, and I'd recommend looking at it for a new project. The Spring project also has examples of how to write REST consumer applications so that your services can talk to each other with minimal effort.
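To make the recommended-products example concrete, here is a minimal sketch of such a service. It deliberately uses the HTTP server that ships with the JDK rather than Spring Boot, purely to stay dependency-free. The `/recommendations/` path, the class name and the in-memory product data are all assumptions made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class RecommendationService {
    // Hypothetical data; a real service would query a model or a datastore.
    private static final Map<String, List<String>> RELATED = Map.of(
            "kettle", List.of("teapot", "mug"),
            "tent", List.of("sleeping-bag", "stove"));

    // The actual "business logic": unknown products get no recommendations.
    static List<String> recommend(String productId) {
        return RELATED.getOrDefault(productId, List.of());
    }

    // Hand-rolled JSON with no escaping, which is only safe for this toy data.
    static String toJson(String productId, List<String> recommendations) {
        String items = recommendations.stream()
                .map(r -> "\"" + r + "\"")
                .collect(Collectors.joining(","));
        return "{\"product\":\"" + productId + "\",\"recommendations\":[" + items + "]}";
    }

    // Serves /recommendations/{productId}; the HTTP method is not checked here.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/recommendations/", exchange -> {
            String path = exchange.getRequestURI().getPath();
            String productId = path.substring("/recommendations/".length());
            byte[] body = toJson(productId, recommend(productId))
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080);
    }
}
```

With the server running, a request for `/recommendations/kettle` on port 8080 should return the JSON document built by `toJson`. Keeping `recommend` separate from the HTTP plumbing means the logic can be tested without starting a server.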

Pipeline services

You may be splitting up a data collection pipeline because a certain part is a bottleneck. Since this area of the system will never have an external-facing API, using a message broker to pass intermediate data between stages is a good idea. We've been using Apache Kafka for this at Brandwatch with great success. You can decide how to distribute the load between your various instances with a lot of flexibility. I gave a talk on how we're using leader election to do that in our event-detection pipeline.
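The shape of a pipeline stage can be sketched without a real broker. In the example below, a `BlockingQueue` stands in for a Kafka topic and a thread pool stands in for several instances of the bottleneck stage. This is only an in-process analogy, not how we use Kafka at Brandwatch; the class name, the stop marker and the upper-casing "work" are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PipelineSketch {
    // Hypothetical marker telling a worker to stop; one is sent per worker.
    private static final String STOP = "__stop__";

    // Pushes the input through `workers` competing consumers and returns the
    // processed messages, sorted because completion order is not deterministic.
    static List<String> run(List<String> input, int workers) {
        BlockingQueue<String> topic = new LinkedBlockingQueue<>(); // stand-in for a broker topic
        Queue<String> results = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String message = topic.take(); // blocks until a message arrives
                        if (message.equals(STOP)) {
                            return;
                        }
                        results.add(message.toUpperCase()); // stand-in for the real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (String message : input) {
                topic.put(message);
            }
            for (int i = 0; i < workers; i++) {
                topic.put(STOP);
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running the pipeline", e);
        }

        List<String> sorted = new ArrayList<>(results);
        Collections.sort(sorted);
        return sorted;
    }
}
```

The producer never knows how many workers exist, so the number of instances can change without touching the upstream stage. A real broker adds what this sketch lacks: durability, and delivery across processes and machines.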

Slow services

Some services can take a long time to provide a final response. For example, you may be firing off a batch job like updating a large portion of a search index, or performing a large MapReduce task. A REST API could work well here, with the request immediately returning the location where the output is expected to be stored. The caller can then poll that location until the output appears, and this kind of polling can be delegated to a background thread in your application.
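Here is a rough, in-process sketch of that submit-then-poll contract. The `submit` method plays the role of the request that returns a location straight away, and `fetch` plays the role of a later request to that location. The `/results/` prefix, the 200 ms of simulated work and all of the names are assumptions for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SlowJobSketch {
    private final Map<String, String> outputs = new ConcurrentHashMap<>();
    private final ExecutorService workers = Executors.newSingleThreadExecutor();

    // Returns at once with the location where the output will eventually appear.
    String submit(String input) {
        String location = "/results/" + UUID.randomUUID();
        workers.execute(() -> {
            sleep(200); // stand-in for the slow batch job
            outputs.put(location, "reindexed:" + input);
        });
        return location;
    }

    // Empty until the job has written its output.
    Optional<String> fetch(String location) {
        return Optional.ofNullable(outputs.get(location));
    }

    // Client-side polling with a bounded number of attempts.
    String pollUntilReady(String location, long intervalMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Optional<String> result = fetch(location);
            if (result.isPresent()) {
                return result.get();
            }
            sleep(intervalMillis);
        }
        throw new IllegalStateException(
                "No output at " + location + " after " + maxAttempts + " attempts");
    }

    void shutdown() {
        workers.shutdown();
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }
}
```

To keep the caller's main thread free, the polling can be wrapped in something like `CompletableFuture.supplyAsync(() -> jobs.pollUntilReady(location, 50, 100))`. Bounding the number of attempts matters: a job that never finishes should surface as an error rather than a thread that waits forever.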

It's worth bearing in mind that while splitting your architecture into smaller services can reduce the complexity of the code and make it easier to understand various parts of the system in isolation, it pushes out the complexity into managing how the services communicate with each other. If you change the REST responses of a service, or alter the Java class that you're serialising to send down the wire, which other parts of the system will it break? It's hard to know at compile time. A team might not know that they are subtly breaking the interface with another team's service. Communication, monitoring and regression testing are extremely important here.
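One cheap form of regression testing is for the consuming team to write down the fields they rely on and check a sample response against that list. The sketch below assumes the response has already been parsed into a map. The field names `product` and `recommendations` are hypothetical, and in practice you would likely reach for a proper contract-testing tool rather than this hand-rolled check.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ContractCheck {
    // Fields the (hypothetical) consuming service depends on.
    static final Set<String> REQUIRED_FIELDS = Set.of("product", "recommendations");

    // Returns the required fields that are absent from a parsed response.
    // An empty result means the response still honours the contract.
    static Set<String> missingFields(Map<String, ?> response) {
        Set<String> missing = new TreeSet<>(REQUIRED_FIELDS);
        missing.removeAll(response.keySet());
        return missing;
    }
}
```

Run against the producing service in a shared test environment, a check like this turns a silent rename (say, `product` becoming `productId`) into a failing build instead of a production surprise. Extra fields are ignored on purpose, so the producer stays free to add to its responses.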

So, services. I like them. Maybe you will too, but be careful and apply them gently.
