Advice on Freeing Features From a Monolith
It’s a tale as old as time: Boy meets girl. Boy and girl build a service interface to help call center agents assist travelers with their Expedia-related travel issues. Service grows in features over time until it’s a giant, creaking monolith that’s vital to the business. Boy and girl decide to move to a microservice architecture to decouple themselves from the slow release cycle and annoying build times of said monolith.
Again, tale as old as time.
And that’s about where one of the services inside Expedia is now! The team has been working on a large, shared, monolithic service (which from here on we’ll call “the monolith”) over the past several months to build a new workflow for our call center agents to use when helping customers exchange flights. To make our architecture more flexible (both in terms of release cycle and in terms of composability with services running on different runtimes), we elected to build our new functionality in a microservice architecture, exposed via a handful of RESTful service APIs contained in a new service which we’ll call the Air Exchange Service.
This has worked out reasonably well. But as this was our team’s first true effort at a genuinely microservice-oriented architecture, we’ve noticed some things about the evolution of our nascent services that may provide valuable lessons to those who’d also like to follow such a path.
What Went Well
We created an API for determining whether a customer’s flight was eligible to be exchanged via our workflow. It accepts an air record locator (which uniquely identifies the itinerary to be exchanged) and returns the kinds of flight exchanges that customer can make. Simple enough, right? Because the interface was very small and declarative, we could retrofit it with additional logic later without forcing difficult and annoying changes on the client.
This came in handy shortly before release, when a last-minute fraud prevention requirement (which I won’t go into here) forced us to declare a certain kind of itinerary automatically ineligible for exchange (though such itineraries could still be canceled).
So our system for determining exchange eligibility started out looking like this:
And ended up looking like this:
And our monolith got access to this new functionality immediately, with no redeployment required. Go us!
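To make that concrete, here is a minimal sketch of the idea, not our actual service. Every name in it (`ExchangeType`, `Itinerary`, `flagged_for_fraud_review`, the two `allowed_exchanges_*` functions) is hypothetical, and the fraud rule is a stand-in for the real requirement. The point is only that the client-facing signature (record locator in, allowed exchanges out) stays the same while the logic behind it grows.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ExchangeType(Enum):
    # Hypothetical categories; the real API's exchange types are not shown here.
    DATE_CHANGE = "date_change"
    ROUTE_CHANGE = "route_change"


@dataclass(frozen=True)
class Itinerary:
    record_locator: str
    ticketed: bool
    flagged_for_fraud_review: bool = False  # assumed flag, for illustration only


# Stand-in for the booking lookup the service performs internally.
_ITINERARIES = {
    "ABC123": Itinerary("ABC123", ticketed=True),
    "XYZ789": Itinerary("XYZ789", ticketed=True, flagged_for_fraud_review=True),
}


def allowed_exchanges_v1(record_locator: str) -> set[ExchangeType]:
    """Original logic: any ticketed itinerary can be exchanged."""
    itinerary = _ITINERARIES[record_locator]
    if not itinerary.ticketed:
        return set()
    return {ExchangeType.DATE_CHANGE, ExchangeType.ROUTE_CHANGE}


def allowed_exchanges_v2(record_locator: str) -> set[ExchangeType]:
    """Later logic: same input and output, plus a new internal rule."""
    itinerary = _ITINERARIES[record_locator]
    if itinerary.flagged_for_fraud_review:
        return set()  # ineligible for exchange; cancellation is handled elsewhere
    return allowed_exchanges_v1(record_locator)
```

A client that only ever sends a record locator and reads back a set of exchange types cannot tell which version it is talking to, which is why the new rule needed no client redeployment.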
- As predicted, our microservice architecture allowed us to decouple ourselves from the monolith’s release schedule, enabling us to push out bugfixes and new features daily. This came in handy when, in preparation for a big release, the monolith had a code freeze; we could still bugfix and iterate away on our own microservices, quite secure in the knowledge that the upstream client would be unaffected by our meddling.
- Microservices insulate clients from your internal architectural decisions, allowing you to test and reject new technologies without those changes being visible to consumers. For reasons beyond the scope of this article, at one point we had to switch from a function as a service to a Spring Boot application, which entailed several internal changes to our codebase in addition to tacking on all of Spring Boot’s dependencies. Because we’re running a microservice, not a library, none of those dependency changes were visible to clients. The same would hold for even more dramatic changes, from adding internal libraries to switching languages entirely. When you’re communicating over HTTPS, as the saying goes, nobody knows you’re a dog.
- Because we were using a new repo with only code we cared about inside of it, we had 45-second build times and trivial incremental compile times when running tests and writing new code. Our process is very iterative, with code reviewers frequently requesting small changes. Our short build times gave us dramatic improvements in our total turnaround time on new features and bugfixes: small changes could be written, tested, reviewed, and merged in the space of a couple of hours. In our monolithic repo, changes of similar scope would face a turnaround time of a day or more due to the time spent waiting on (multiple, iterative) builds to complete.
What Went Less Well
We created a Flights Shopper API, which calls our supply team’s Search API, which in turn contacts back-end flight providers to figure out which flights match the dates and times a customer wants for their journey. The interface we used to reach that flight provider was very complicated, requiring a lot of information to be sent on any given service call. Owing to schedule pressures, rather than putting a lot of effort into abstracting out this interface (no minor task), we had the public interface of our own API mostly mimic the interface of the supply team’s Search service. In other words, we ended up with a sort of “pass-through” service that would accept some information, shuffle it around a bit, and pass it along to the supply team, which would then return the flight provider’s response for us to format and return to the client.
We hoped this would evolve to have its own set of business logic, which would, over time, transform it into something more than a passthrough service. However, our Shopper service’s interface echoed our dependency’s interface so closely that we had a lot of trouble making any kind of implementation changes without forcing changes to how the monolith was calling our service. If our dependency needed a “cabin code” field, for example, the monolith would have to be modified to provide that cabin code. If our dependency exposed a new field in its response, we’d have to send that field to the client explicitly to gain access to it (rather than roll it into the preexisting response, as in our last-minute scenario described above).
In short: our implementation details dictated our interface, a pattern which, once begun, resulted in bugfixes and new features rippling out to clients in a difficult-to-control fashion, and which didn’t leave a lot of room for the service to develop its own business logic.
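Here is a small sketch of how that ripple happens, using made-up names throughout (`SupplySearchRequest`, `supply_search`, `shop_flights_passthrough`, and `legacy_client_call` are illustrative, not the real APIs). It uses plain function calls as a stand-in for HTTP requests: when the pass-through’s parameters mirror the dependency’s request field for field, a new required field in the dependency becomes a new required field for every client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupplySearchRequest:
    """Hypothetical shape of the dependency's request."""
    origin: str
    destination: str
    departure_date: str
    cabin_code: str  # imagine the dependency started requiring this later


def supply_search(request: SupplySearchRequest) -> list:
    """Stand-in for the dependency; returns fake flight descriptions."""
    return [
        f"{request.origin}-{request.destination} "
        f"{request.departure_date} cabin={request.cabin_code}"
    ]


def shop_flights_passthrough(origin, destination, departure_date, cabin_code):
    """Pass-through: the public parameters mirror the dependency's fields."""
    request = SupplySearchRequest(origin, destination, departure_date, cabin_code)
    return supply_search(request)


def legacy_client_call():
    """A client written before cabin_code existed; it no longer works."""
    return shop_flights_passthrough("SEA", "JFK", "2030-01-15")
```

Calling `legacy_client_call()` raises a `TypeError` because the caller has no cabin code to supply. Over a real service boundary the failure would look more like a rejected request, but the cause is the same: the client had to change because the dependency did.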
I’ve noticed this antipattern in a couple of different software architectures, so it’s something to be on the lookout for. Every time you must modify and redeploy all your clients to accommodate a service change, the microservice is costing you time and effort rather than saving it. Ideally, implementation details come and go, but interfaces are forever.
- Keeping your interface small and declarative is always a good move. Doing so (as in our Exchange Eligibility example) allows you to retrofit your APIs with additional logic without interfering with your clients. Sending an itinerary ID rather than each itinerary attribute in our Shopper API might have enabled similar evolution (allowing the service itself to pick and choose the relevant parts of the itinerary to send along to the dependency).
- When crafting a new service, it’s helpful to enshrine in documentation a specific mandate that the service has. This mandate should make it clear how the service differs from any service it is dependent upon in terms of its responsibilities. With any luck, that’ll make it clear where new functionality should be onboarded.
- A common code smell is finding your own services’ interfaces “mirroring” those of your dependencies (as in the Shopper example above). This implies your service may have an insufficient level of abstraction from the underlying implementation to allow easy evolution without changes rippling out to your interface, and therefore to your clients. Clients hate that.
This is doubly true if your new API is consuming an API that your organization owns. There should be an unambiguous answer to the question: “If I want to onboard X new feature, should that go in my new API, or should it go in the API I’m calling?” Uncertainty on this point, or an inability to suggest a potential feature that could be added to the new API beyond just “expose a new feature of my underlying dependency,” signals a muddied architectural boundary between the two that will probably haunt you later.
- Microservices require a small initial time investment, but pay for themselves quickly in terms of new feature turnaround time if they’re crafted with the right implementation-hiding interface. Since feature turnaround time is a high priority for most development teams (ours, at least), this is probably the most generally applicable advantage of this sort of architecture.
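The first lesson above (send an itinerary ID and let the service pick the attributes it needs) might look something like the sketch below. It is an illustration under assumptions, not what we built: `StoredItinerary`, `_ITINERARY_STORE`, `build_supply_request`, and `shop_flights` are all hypothetical names, and the dependency call is faked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredItinerary:
    """Hypothetical record the service can look up by ID on its own."""
    origin: str
    destination: str
    cabin_code: str


# Stand-in for whatever itinerary lookup the service has access to.
_ITINERARY_STORE = {"ITIN-1": StoredItinerary("SEA", "JFK", "Y")}


def build_supply_request(itinerary_id: str, new_departure_date: str) -> dict:
    """Derive the dependency's fields inside the service, not in the client."""
    itinerary = _ITINERARY_STORE[itinerary_id]
    return {
        "origin": itinerary.origin,
        "destination": itinerary.destination,
        "departure_date": new_departure_date,
        # If the dependency later needs another field, it is added here only.
        "cabin_code": itinerary.cabin_code,
    }


def shop_flights(itinerary_id: str, new_departure_date: str) -> list:
    """Declarative interface: the client says what it wants, not how to get it."""
    request = build_supply_request(itinerary_id, new_departure_date)
    # Fake dependency call, standing in for the downstream search.
    return [f"{request['origin']}-{request['destination']} {request['departure_date']}"]
```

With this shape, a new field required by the dependency changes `build_supply_request` and nothing on the client side. The trade-off is that the service now needs its own way to look up itineraries, which is part of the extra abstraction work.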
Sandi Metz has an excellent video called Less: The Path To Better Design, which goes over these subtle conceptual issues with defining interfaces in some detail. Her examples specifically refer to code-level interfaces, rather than service interfaces, but they apply nevertheless!