I recently finished reading Building Evolutionary Architectures and wanted to summarize it for future reference.

It started slowly, with a lot of abstract concepts the authors call “-ilities” (such as scalability) and plenty of warnings about trade-offs, but it picked up as soon as they reached some concrete examples, both real-world and imaginary. The most interesting part of the book’s opening was the outline of GitHub’s migration practices and Netflix’s use of resiliency tools to achieve evolution.

Software architecture isn’t a purely technical concept, nor a collection of tools du jour, the same way software architects are not (or at least should not be) solely focused on the technical aspects of a product or project. Architects and the teams they work with should always keep in mind the problem they’re trying to solve and shape the architecture to reflect the problem domain they’re working in. Given the variability of problem domains, the discipline and knowledge the teams possess, budgetary restrictions, time-to-market and the emphasis on feature development, there will always be trade-offs, making one type of architecture more appropriate than another.

To account for all that variability in the real world, not just in the technological space, a system’s architecture should evolve over time. To achieve the transition between different architectural states in a way that gives teams a certain degree of safety and, at the same time, enough freedom to adapt to change, the authors focus on three key aspects:

  • incremental change: the degree to which an architecture is malleable, or how easily a change can be made and deployed;
  • fitness functions: (preferably) automatic and (where needed) semi-automatic or manual checks that tell us about the system’s health (see the sketch after this list);
  • architectural coupling: the degree to which certain components, modules or implemented abstractions are coupled together.
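To make the fitness function idea less abstract, here’s a minimal sketch of an automated structural check, assuming a Java codebase and the ArchUnit library; the package names and the layering rule are made up for illustration and aren’t taken from the book:

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.lang.ArchRule;

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

public class LayeringFitnessFunction {

    public static void main(String[] args) {
        // Import the compiled classes of a hypothetical application.
        JavaClasses classes = new ClassFileImporter().importPackages("com.example.app");

        // Fitness function: the persistence layer must never depend on the web layer.
        ArchRule rule = noClasses()
                .that().resideInAPackage("..persistence..")
                .should().dependOnClassesThat().resideInAPackage("..web..");

        // Throws an AssertionError when the constraint is violated,
        // which makes it easy to wire into a CI pipeline as a failing build step.
        rule.check(classes);
    }
}
```

The same kind of rule can guard module boundaries, naming conventions or dependency direction, which is what makes these checks cheap to run on every commit.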

These criteria are best described through examples, and the authors use a couple of different architectural styles, ranging from the one that performs worst on these criteria (the Big Ball of Mud) to the ones that perform best (microservices or service-based architectures), with various examples in between (Layered Monolith, Modular Monolith, Service-Oriented Architecture, etc.).

The goal of each architectural pattern is to split the real world into bounded contexts that can be represented by independent services or modules (loose coupling) that work by certain rules and abide by the contracts they’ve made with their consumers (fitness functions), in a way that doesn’t impede progress (incremental change). As always, there are trade-offs that also influence the choice of architecture. For example, transactional scopes are often difficult to implement in a microservice architecture, but easier to handle in a monolith or a larger service. In any case, the goal is to have as much automation (fitness functions) as possible, especially when it comes to monitoring, Continuous Integration and Continuous Delivery.

Since most projects start in a hectic environment with limited people, time and money, they tend to turn into a Big Ball of Mud, or into a Layered Monolith if there’s some discipline. From there, the architecture can degrade even further, resulting in even more coupling that makes incremental change harder as time passes, and fitness functions even more difficult to set up due to unclear boundaries.

An alternative is to rewrite everything from scratch, but that’s often not an option, so the authors recommend a refactoring effort that strives to make the monolith more modular, with clear boundaries between different contexts, before any rearchitecting effort (if that’s even necessary). Often, the monolith is first made as modular as possible before being split into larger standalone services that each own a larger bounded context but still retain most of the transactions. As the services grow independently, the same pattern can be applied to each individual service until you reach the point of diminishing returns, which is usually when splitting a microservice into smaller components for the sake of reusability is no longer cost-effective.

The authors aren’t fervent advocates of microservices, but present a balanced approach which can be summarized as “it depends”: on the context, use case, domain, etc. However, one thing they do warn about is code reuse across microservices as opposed to duplication. On the face of it, duplication sounds bad, but if you think from the problem domain’s perspective, it makes sense to split the responsibilities of a real-world entity between services. For example, both the Shipping and Accounting services may be connected to a Customer entity, but each has its own use case and its own bounded context. Instead of developing a Customer service shared by both Shipping and Accounting, which would inevitably introduce unnecessary coupling, each service implements its own version of the Customer entity and notifies the other of changes (if necessary), either directly via an API call or through an event. That way, each service can evolve independently, implement its own fitness functions that make sure it’s working as expected, and stay decoupled from the rest of the system to the greatest possible extent.
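A minimal sketch of what that duplication might look like, again in Java and with an event-based notification; all type and field names here are hypothetical, not from the book:

```java
import java.time.Instant;

// Shipping's view of a customer: only what shipping needs.
record ShippingCustomer(String customerId, String deliveryAddress) {}

// Accounting's view of the same real-world customer: a different bounded context.
record AccountingCustomer(String customerId, String billingAddress, String taxNumber) {}

// Event published by whichever service owns the change, so the other side
// can update its own copy without sharing code or a database.
record CustomerAddressChanged(String customerId, String newAddress, Instant occurredAt) {}

class ShippingService {
    // React to the event by updating only the local representation.
    ShippingCustomer apply(ShippingCustomer current, CustomerAddressChanged event) {
        if (!current.customerId().equals(event.customerId())) {
            return current;
        }
        return new ShippingCustomer(current.customerId(), event.newAddress());
    }
}
```

The duplication is deliberate: each service carries only the fields it cares about, and the event is the contract between them rather than a shared Customer class.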

That leaves teams with the opportunity to develop and release changes to each service at their own cadence. The teams can also focus on their problem domain and their bounded context, and make the best decisions they can to make their workflow even more efficient. This is especially true when it comes to developing products as opposed to working on projects. Projects are “easy”: you form a team, it does the required amount of work, the team disbands and maintenance is handed over to another team. The problem with that approach is that the maintenance team is left sorting out the bugs the first team introduced, while the first team remains oblivious to the mistakes it made, which it will inevitably repeat on the next project. Product teams take more ownership because they have to live with the consequences of their work. It’s in their best interest to do the best work they can at the moment and keep improving, both from a product and a technical perspective.

All in all, it’s a great read, so if you made it this far, feel free to check it out for yourself and fill in the gaps with details I’ve omitted from this summary.