Debugging By The Numbers

The other day my team had an all-day meeting to debug a very weird, ugly problem with some accounts in the new billing system we’re finishing up. While hunting for the root cause, I went through a process I’ve used a few times before, and I thought I’d share it.

The issue was with how some money was distributed to the account. Sometimes people get money back and it has to be used in a specific way. In this case it appeared it wasn’t being spread out the way it should have been, and the numbers looked very strange. In the main example case we were using, there was a number on the account’s invoice that didn’t match anything. When you’re debugging, these mysterious numbers can be very useful. While everyone else was looking at other stuff, I took a little while to try to find where that number was coming from. Code (hopefully) doesn’t just invent numbers, so it had to come from somewhere, and I’ve had good luck in the past figuring out big problems just by figuring out the numbers.

In this case, the important numbers on their invoice were:

A payment of $562
A credit of $629
A big refund of $2438
A check issued to the account for $1348
A credit balance of $562

First, we don’t usually do credit balances at all. It should have been $0. Then, $1348 didn’t immediately jump out as having any relation to the other numbers. Our non-technical project owner’s first inclination was to believe the program was making things up but I usually go on the assumption that this isn’t the case. :)

The first thing I figured out was that the $2438 had been split into 2 chunks of $1219 on 2 invoices. Since I didn’t know that this shouldn’t have happened (score another one for ignorance), I accepted it and figured out that $629 was $1219 - $562. So this was half the refund minus their payment, which is what should happen. Good.

I then saw that $1348 did have a relation to the other numbers: it was $629 * 2. I started going over this out loud for everybody (also an extremely valuable debugging technique) and it all fell into place. What I finally saw was that the $2438 had been split up over 2 invoices. Then both months’ payments had been taken out of the refund -> $2438 - ($562 * 2) = $1348. The system had accounted for all the money the account would owe us, taken it out and refunded them the rest. It then held on to the $562 for next month’s invoice in order to pay it off then. Whew. Only took about 2 hours.

So by going over this math with a clear head and no expectation of what the system should have done, I found the underlying problem. Everything should have collapsed onto one invoice and been handled at once. The refund shouldn’t have been split up, and both $562 payments should have been made at once rather than one of them being held until next month. This is a big issue that makes people’s invoices look weird, but the important thing for a billing system is that no money is missing. People had originally thought maybe we were over-paying accounts, but thankfully that isn’t the case. Now we need to figure out how to fix it going forward, but that’s a job for the billing people.
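The "no money is missing" check is easy to write down. Here is a minimal sketch in Python, assuming you can list everything that came into an account and everything that went out. The function name and the amounts are made up for illustration; they are not the figures from the invoice above, and this is not how our billing system does it.

```python
def unaccounted(money_in, money_out):
    """Return money in minus money out; zero means nothing is missing."""
    return sum(money_in.values()) - sum(money_out.values())


# Hypothetical amounts in whole dollars, chosen only to illustrate the check.
money_in = {"refund": 1000}
money_out = {
    "payment_this_month": 150,
    "held_for_next_month": 150,
    "check_issued": 700,
}

print(unaccounted(money_in, money_out))  # 0
```

A result of zero means the invoice may look weird but every dollar went somewhere. Anything else is the size of the hole you’re hunting for.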

In the end, once again ignorance saves the day. I didn’t know about some of the particular workings of refunds in this case, so I wasn’t making assumptions about them. I know I don’t know all the bits and pieces of how the invoices work, so I went through the exercise of finding where the mysterious numbers came from, and that led me to the answer. If you’re debugging numbers, writing them down and going through them all with a calculator is immensely helpful. Add them up, subtract them from each other, try to find where the differences are. And talk it out. Maybe you’re going down the wrong path, or maybe the thing you don’t know is something basic you do need to know. It’s debugging, it’s hard. There’s no map. But don’t let those mysterious numbers float out there, they might be the key to the answer.
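If the calculator gets tedious, the same add-and-subtract hunt can be roughly automated. This is only a sketch, assuming the mystery number is a sum or difference of two known numbers, each possibly doubled (the "2 invoices" kind of thing). The function name `explain_number` and the example amounts are hypothetical, and a real search would probably want more operations than this.

```python
from itertools import combinations


def explain_number(target, known, multipliers=(1, 2)):
    """Return simple expressions over `known` that evaluate to `target`."""

    def label(name, m):
        return name if m == 1 else f"{name} * {m}"

    hits = []
    # A single known number, possibly scaled (e.g. two months of payments).
    for name, value in known.items():
        for m in multipliers:
            if value * m == target:
                hits.append(label(name, m))
    # Sums and differences of every pair, each side possibly scaled.
    for (a_name, a), (b_name, b) in combinations(known.items(), 2):
        for ma in multipliers:
            for mb in multipliers:
                left, right = label(a_name, ma), label(b_name, mb)
                if a * ma + b * mb == target:
                    hits.append(f"{left} + {right}")
                if a * ma - b * mb == target:
                    hits.append(f"{left} - {right}")
                if b * mb - a * ma == target:
                    hits.append(f"{right} - {left}")
    return hits


# Hypothetical amounts in whole dollars, not the ones from the invoice above.
known = {"refund": 1000, "payment": 150}
print(explain_number(700, known))  # ['refund - payment * 2']
```

An empty list doesn’t mean the program invented the number. It just means the relationship isn’t one of the simple ones this tries, and it’s time to go back to talking it out.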