Good Boundaries Make For Easier Debugging

We're a few days into talking about debugging. At this point you can articulate your bug clearly and reproduce it reliably. Now it's time to start locating the specific place in your code where reality diverges from expectation.

That sure sounds tedious, doesn't it? Do you really need to go through your code line by line?

Ideally, not until you've narrowed things down a bit. If you have well-defined boundaries in your code, then you can first inspect behavior at those boundaries to determine the area where things might be going wrong.

So what do I mean when I say boundary?

Generally speaking, a boundary is the edge of a system. HTTP requests and responses are the boundaries between your server and client. SQL queries are the boundaries between your application code and your database. Dispatched actions and the emitted changes to state are the boundaries of a Redux store. Arguments and return values are the boundaries of a function.

For debugging purposes, I like to say that a boundary is a place where information crosses explicitly, in a known format. In an ideal boundary, something on one side doesn't need to know anything about what's on the other side, except for the information that is expected to cross the boundary. I tell you to focus on boundaries because they are comparatively easy to evaluate for correctness. Remember, we're still in the beginning stages. We're just trying to figure out where things are going wrong. We're not yet worried about why.

Trace the execution path of the code that is relevant to the bug in question. Where can you first say that something has definitively gone wrong? Often that initial point is given to you in the stack trace of a thrown error. Now look upstream of that point and pick out a boundary where you can easily inspect the information crossing it. This can be function arguments, generated database queries, HTTP requests or responses, anything that you can simply inspect and say "yes, this looks correct" or "no, this looks wrong."
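To make "inspect the information crossing a boundary" concrete, here is a rough sketch for the function-boundary case. It wraps a synchronous function so that its arguments and return value get logged every time it is called. The names `inspectBoundary` and `fullName` are made up for illustration, not part of any library, and the wrapper assumes the values can be serialized with `JSON.stringify`.

```typescript
// Hypothetical helper: wraps a synchronous function and logs what crosses
// its boundary (arguments going in, return value coming out).
function inspectBoundary<Args extends unknown[], Result>(
  label: string,
  fn: (...args: Args) => Result,
  log: (message: string) => void = console.log,
): (...args: Args) => Result {
  return (...args: Args): Result => {
    log(`${label} in: ${JSON.stringify(args)}`);
    const result = fn(...args);
    log(`${label} out: ${JSON.stringify(result)}`);
    return result;
  };
}

// Hypothetical function sitting on the execution path of the bug.
function fullName(user: { first: string; last: string }): string {
  return `${user.first} ${user.last}`;
}

// Swap the wrapped version in at the call site while debugging.
const inspectedFullName = inspectBoundary("fullName", fullName);
inspectedFullName({ first: "Ada", last: "Lovelace" });
```

The point of the wrapper is that you only look at two things, what went in and what came out, and you can judge each one at a glance without reading the function body.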

If the information looks correct, then the defect likely lies downstream of that point. If not, then something is going wrong upstream of it. Either way, pick another boundary in that direction and repeat the process. Eventually you'll zero in on the smallest component you can find where the information coming in looks correct but the information going out looks wrong.
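The narrowing process above can be sketched in code if you pretend the execution path is a simple chain of stages. Everything here is hypothetical: the `Stage` shape, the three pricing stages, and the deliberately planted bug are all invented for illustration. For simplicity the sketch walks the chain from the start rather than upstream from the failure, but it lands on the same thing: the first component whose input looks correct and whose output looks wrong.

```typescript
// Hypothetical model of an execution path: each stage has a boundary check
// that answers "given this input, does the output look correct?"
interface Stage {
  name: string;
  run: (input: number) => number;
  looksCorrect: (input: number, output: number) => boolean;
}

const stages: Stage[] = [
  {
    name: "toCents",
    run: (dollars) => Math.round(dollars * 100),
    looksCorrect: (_dollars, cents) => Number.isInteger(cents) && cents >= 0,
  },
  {
    name: "applyDiscount",
    // Deliberate bug for the demo: subtracts 10 cents instead of 10 percent.
    run: (cents) => cents - 10,
    looksCorrect: (cents, discounted) => discounted === Math.round(cents * 0.9),
  },
  {
    name: "addTax",
    run: (cents) => Math.round(cents * 1.08),
    looksCorrect: (cents, taxed) => taxed >= cents,
  },
];

// Returns the name of the first stage whose input passed the previous
// boundary check but whose own output fails, or null if every boundary passes.
function findFirstBadStage(path: Stage[], input: number): string | null {
  let value = input;
  for (const stage of path) {
    const output = stage.run(value);
    if (!stage.looksCorrect(value, output)) {
      return stage.name;
    }
    value = output;
  }
  return null;
}

console.log(findFirstBadStage(stages, 12.5)); // prints "applyDiscount"
```

Notice that the result is a location, not an explanation. It tells you which component to open up, and only then is it time to start asking why.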

This may sound obvious, and you're likely doing something similar already when you debug, but there are two mistakes I see people make all the time when performing these steps. The first is that they start theorizing at the first bad boundary they come across. Remember, we're waiting to form a theory until we've narrowed down the location of the defect. A single bad boundary only tells you that something has gone wrong somewhere before that point.

The second mistake I see is going too quickly into line-by-line analysis of the code, or just dropping log statements in random places. There's nothing inherently wrong with these approaches, but they tend to be slower and less focused than the boundary approach I just described. When you test at the boundaries, what you're really doing is testing discrete, known behaviors of defined systems. As I said, this makes it much easier to evaluate correctness.

It's getting late and my kids are hungry for dinner, so I'm going to have to show you examples tomorrow ;)
