Clean Code Chapter 11

I read Clean Code several years ago, and I have been rereading it with some co-workers the past few weeks. On my original read-through, I tended to discount a lot of what Martin had to say as idealistic opinion, but I've changed many of my opinions about software engineering since then, so I have been enjoying reflecting on the chapters as we have read them. I still think it's reasonable to take some of what Martin has to say with a grain of salt. He is hyperbolic and the book is old; some of the specific examples aren't relevant anymore (the testing chapter in particular has some horrific examples).

With that in mind, we ran into some problems with Chapter 11, and I thought it would be worthwhile to flesh out my thoughts on the chapter. Chapter 11 is about systems design, and in it, Martin has some thought-provoking things to say about how malleable systems should be over time.

Unfortunately, my co-workers, and I think most modern readers, read the implementation advice (to use a specific style of Java proxies) and (correctly) said, "This sounds awful, none for me, thanks!" I contend, though, that the thesis of this chapter has nothing to do with Java proxies, and while the advice Martin offers sounds trite, he is actually touching on some concepts worth examining.

What is the actual point of the chapter?

Most of the chapter argues that design doesn't all have to happen up front. Some folks believe they must design their system correctly up front or they risk creating technical debt that will never be cleaned up. I think that not only is that impossible, but that it's harmful, because it's likely that your understanding of how your software will be used is subtly wrong. It's likely that software will need to change in ways you can't possibly expect, so it would be better to design something as quickly as possible while allowing for future changes. This is really the important takeaway of the chapter, in my opinion.

Martin makes the point that if you identify the boundaries between subsystems well, even if the systems themselves are poorly thought out, it should be easy to completely replace those subsystems one at a time without bringing the system to a halt.

This idea is important, because many developers are too “rewrite happy.” By that, I mean that they are frustrated with a particular system or subsystem and believe that they can rewrite it from scratch with perfect backward compatibility more quickly than they can rework the system in place. If the system is small, that might even be true. I think, though, that in systems that have been around for years or decades, it is very hard to reach feature parity within the timeline most companies would be comfortable committing to for a rewrite. For that reason, I think that most rewrites must be done in place or not at all.

The metaphor of city building

The chapter begins by talking about the problems of cities, and it draws metaphors to the development process at two different levels: organizational and developmental. On the organizational side, Martin describes a city as a collection of specialized units, each doing its own job so that the larger system works. Every city has different roles that are focused on different levels of abstraction. For example, mayors probably don’t need to care about the formatting of birth certificate applications. They do care that certificates are being issued. At the same time, the county clerk doesn't need to know about signage for businesses on Main Street. Everyone has their own job, and the system works well when everyone is focused on their particular level of abstraction (and maybe one or two above or below it).

This separation of levels is exactly how your code should be organized. There are some classes that exist to orchestrate the interaction among other classes. These orchestration classes don't actually know how the tasks are to be performed, but they need to make sure that the classes that do know are set up correctly. This is like electing a mayor to make sure that a city is running well. Just like a city, software works best (and is easiest to evolve) when classes aren't trying to manage the entire process. Attempting to manage the entire process in one class will result in massive "god classes" that make refactoring more difficult by allowing future developers to tightly couple their changes to things that they shouldn't.
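As a minimal sketch of that mayor-and-clerk split (all of the names here are hypothetical and mine, not from the book), an orchestration class can hold references to the classes that know the details and do nothing but wire them together:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical low-level role: knows how a certificate is laid out.
interface CertificateFormatter {
    String format(String childName);
}

// Hypothetical low-level role: knows how certificates are filed.
interface CertificateRegistry {
    void file(String certificate);
}

// The "mayor": makes sure certificates get issued without knowing
// how they are formatted or where they are stored.
final class CertificateOffice {
    private final CertificateFormatter formatter;
    private final CertificateRegistry registry;

    CertificateOffice(CertificateFormatter formatter, CertificateRegistry registry) {
        this.formatter = formatter;
        this.registry = registry;
    }

    void issue(String childName) {
        registry.file(formatter.format(childName));
    }
}

// Simple registry used only for this illustration.
final class InMemoryRegistry implements CertificateRegistry {
    final List<String> filed = new ArrayList<>();

    @Override
    public void file(String certificate) {
        filed.add(certificate);
    }
}

class CityDemo {
    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        CertificateOffice office =
                new CertificateOffice(name -> "Certificate of birth: " + name, registry);
        office.issue("Ada");
        System.out.println(registry.filed);
    }
}
```

Changing the certificate layout or the filing mechanism touches one small class each, and CertificateOffice never has to change.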

Cities are not exactly like software, though, and the development side of a city is where Martin draws some distinctions. For example, widening a road can cause a lot of pain (temporary delays, etc.) for people who interact with that road. Software isn't physical. We can always wrap an existing piece in an abstraction. Performance and complexity might be marginally worse, but abstraction lets you make temporary changes (that can be made permanent later) while people are still using the software.

How do you actually build software around abstractions?

Before getting into exactly how to build abstractions that make maintenance easier, it's worth acknowledging that there are a lot of people who fundamentally disagree with what Martin has to say here. They believe that it is not possible to draw nice boundaries in all (or even any) situations, and that you should instead focus on other ways (rewrites, upfront design documentation) to mitigate the complexity of making changes in a large code base. I personally don't agree with that viewpoint (as I mentioned above), but Martin's take is a controversial one, and I think it's easy to gloss over that because of the slick way he presents his opinions.

Drawing boundaries

I have encountered the idea that abstraction is something that only "other developers" deal with (OS, networking, etc.). I think that mindset is preventing us from seeing our code as a transformation on top of information.

If we visualize our data as passing through the pipeline of our code, it becomes more obvious that it would be easy to separate two pipes to attach a new stage to the pipeline. I think this metaphor works well with interfaces. Interfaces can act as an abstraction between two (or more) types that makes it easy to quickly identify where/how new functionality should be slotted in. After all, it's far easier to take two interfaces and say that a new orchestrator should sit between them and that new orchestrator should delegate something to a new system than it is to rewrite a large class that was performing the same function, probably breaking tests and unknown functionality.
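Here is a rough sketch of the pipeline idea, assuming a hypothetical Stage interface and made-up stages of my own. Because every stage meets the same interface, a new stage can be slotted between two existing ones without editing either:

```java
import java.util.Locale;

// Hypothetical boundary: every stage turns one String into another.
interface Stage {
    String apply(String input);

    // Attach another stage to the end of this one.
    default Stage then(Stage next) {
        return input -> next.apply(this.apply(input));
    }
}

final class Stages {
    static final Stage TRIM = String::trim;
    static final Stage UPPERCASE = s -> s.toUpperCase(Locale.ROOT);
    // The new stage we want to slot into the middle of the pipeline.
    static final Stage COLLAPSE_SPACES = s -> s.replaceAll("\\s+", " ");

    private Stages() {}
}

class PipelineDemo {
    public static void main(String[] args) {
        Stage original = Stages.TRIM.then(Stages.UPPERCASE);
        // Neither TRIM nor UPPERCASE changes when the new stage is added.
        Stage extended = Stages.TRIM.then(Stages.COLLAPSE_SPACES).then(Stages.UPPERCASE);

        System.out.println(original.apply("  hello   world "));
        System.out.println(extended.apply("  hello   world "));
    }
}
```

The same change in a single long function would mean editing the function itself and re-verifying everything around the edit.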

I see a perception that code is easy to write when it is all in one script. The developer understands the process they are coding as a series of steps that must be completed from start to finish. That viewpoint isn't necessarily wrong, but writing code as individual instructions in a single function will lead to thousands of lines of code, disorganized abstractions, and a maintenance nightmare. Instead, we should be getting away from this "scripting approach" and identifying meaningful abstractions that can orchestrate subsystems that perform useful actions, rather than manually hard coding steps in a process. Below, I have outlined several specific patterns for breaking down systems into subsystems from Martin's writing, but there are many more.

Inversion of control

IoC helps separate the specific implementations of things from the code that needs to "use" those implementations. The goal is to be able to dramatically change the details of how a system functions with a minimal amount of code (e.g., swapping a SQL database for an OO database). There are a lot of ways to do that, such as separation of main, factories, and dependency injection. Each of these topics is too complex for me to get into in this post, but I will come back to them in the future.
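Without going deep on any of those techniques, a small sketch can show the shape of the idea. Everything below is hypothetical: the two stores are in-memory stand-ins for a SQL database and an object database, not real drivers. The business logic receives its store through the constructor, and only main names a concrete class:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical boundary between business logic and persistence.
interface UserStore {
    void save(String name);
    boolean exists(String name);
}

// Stand-in for a SQL-backed store: rows keyed by an incrementing id.
final class TableUserStore implements UserStore {
    private final Map<Integer, String> rows = new HashMap<>();
    private int nextId = 1;

    @Override public void save(String name) { rows.put(nextId++, name); }
    @Override public boolean exists(String name) { return rows.containsValue(name); }
}

// Stand-in for an object database: a plain set of stored objects.
final class ObjectUserStore implements UserStore {
    private final Set<String> objects = new HashSet<>();

    @Override public void save(String name) { objects.add(name); }
    @Override public boolean exists(String name) { return objects.contains(name); }
}

// Business logic: depends only on the interface it is handed.
final class SignupService {
    private final UserStore store;

    SignupService(UserStore store) {
        this.store = store;
    }

    // Returns true if the user was newly registered.
    boolean register(String name) {
        if (store.exists(name)) {
            return false;
        }
        store.save(name);
        return true;
    }
}

// "Separation of main": the only place that picks a concrete store.
class SignupApp {
    public static void main(String[] args) {
        UserStore store = args.length > 0 && args[0].equals("object")
                ? new ObjectUserStore()
                : new TableUserStore();
        SignupService signup = new SignupService(store);
        System.out.println(signup.register("ada")); // true
        System.out.println(signup.register("ada")); // false
    }
}
```

Swapping the storage technology is a one-line change in main, and SignupService is untouched either way.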

Cross cutting concerns & aspect oriented programming

Inevitably, there are some things in any system that require the interaction of different subsystems. For example, it is generally accepted that it's a good idea to separate out business logic and isolate it from framework or other kinds of logic, yet some of that other logic is needed almost everywhere. One example would be logging.

So what do you do about things that don't sit nicely in one abstraction, things that many different, unrelated abstractions might need to share? These are called cross cutting concerns. As an example, Martin offers DB persistence. He claims that business logic objects shouldn't include functionality to persist themselves in their inheritance chain.

There are a lot of ways to handle cross cutting concerns, but Martin puts forward "aspect oriented programming." Aspect oriented programming is a technique that involves creating individually simple business logic objects and composing them with systems that add functionality from other domains. Examples include the visitor pattern, decorator pattern, and proxies, which I will get into in future articles.
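To make that less abstract, here is a small sketch using the JDK's built-in java.lang.reflect.Proxy. It is not Martin's example and not a full aspect framework. The Account types and the Auditing helper are hypothetical names of my own, and the "aspect" is a simple audit log rather than persistence:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface.
interface Account {
    void deposit(int cents);
    int balance();
}

// Plain business logic: knows nothing about auditing.
final class PlainAccount implements Account {
    private int balance;

    @Override public void deposit(int cents) { balance += cents; }
    @Override public int balance() { return balance; }
}

// The cross cutting concern, written once and applied to any interface.
final class Auditing {
    private Auditing() {}

    static <T> T wrap(T target, Class<T> type, List<String> auditLog) {
        InvocationHandler handler = (proxy, method, args) -> {
            auditLog.add(method.getName());
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Rethrow whatever the real object threw.
                throw e.getCause();
            }
        };
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(proxy);
    }
}

class AuditDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Account account = Auditing.wrap(new PlainAccount(), Account.class, log);
        account.deposit(500);
        account.deposit(250);
        System.out.println(account.balance());
        System.out.println(log);
    }
}
```

PlainAccount stays a simple object, and the auditing can be added, removed, or replaced without touching it. A hand-written decorator class would get the same result with less reflection, at the cost of one wrapper per interface.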

But what does all of this really mean?

This chapter is filled with controversial viewpoints. I think the core thesis of the chapter is that you can design a system in such a way that uncertain future changes are easy to make, and that the way to design such a system is to think very hard about where boundaries can be drawn between subsystems and to build interfaces at those boundaries.

Having these boundaries well defined does not mean that the resulting code is extensible in every way or that it is even fully functional for all use cases. What those boundaries do buy you is the ability for each part of the system to evolve independently.

The chapter also doesn't explain exactly why Martin has come to this view. It doesn't argue for why his view must be correct; it simply posits it to be so. I have largely come to Martin's defense in this summary so far, and friends have asked me what I think of the chapter and the book as a whole. I really struggle with that question, because I think a lot of the conceptual advice on offer is extremely useful, but I also think that Martin doesn't really set out to write a textbook filled with facts. He is simply documenting his opinions and experiences, which means that the advice in the book isn't meant to be adopted uncritically, but to be measured and considered.

For that reason, I think this book should be required reading for undergraduates. Not because the advice is all good, but specifically because it's not. The specific recommendations that Martin puts forward are often opaque to modern readers, but if you can dig past that, there is wisdom in his opinions, like this chapter about identifying boundaries between systems.