2023-01-27 23:58 UTC Deciding which bugs to fix
Software has an infinite number of bugs. How can we tell which ones to fix?
I propose that it makes the most sense to optimize for people-happiness per unit bug fixing time, maximizing how much our effort improves the product for our users.
To put it in mathematical terms, we want to fix bugs with the highest N·ΔH / T, where:
- N is the number of people the bug affects
- ΔH is the increase in happiness, per affected user, that fixing the bug would bring
- T is an estimate of the amount of time it will take us to fix the bug
(These metrics are very hard to estimate. Don't worry too much about precision here.)
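As a rough sketch of the formula above, the following Python ranks a few made-up bugs by N·ΔH / T. The Bug class, its field names, the choice of days as the unit for T, and every number here are hypothetical, invented purely for illustration; real estimates will be far fuzzier.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    """A hypothetical bug record; every field is a rough estimate."""
    title: str
    n_people: float         # N: number of people the bug affects
    delta_happiness: float  # ΔH: happiness gained per affected person
    fix_time: float         # T: estimated time to fix (assumed unit: days)

    def score(self) -> float:
        # People-happiness gained per unit of bug-fixing time.
        return self.n_people * self.delta_happiness / self.fix_time


def rank(bugs):
    """Return the bugs sorted from highest to lowest N·ΔH / T."""
    return sorted(bugs, key=lambda b: b.score(), reverse=True)


# Made-up examples.
bugs = [
    Bug("typo in error message", n_people=1000, delta_happiness=0.1, fix_time=0.5),
    Bug("crash on startup", n_people=1000, delta_happiness=5.0, fix_time=2.0),
    Bug("rare rendering glitch", n_people=10, delta_happiness=1.0, fix_time=5.0),
]

for bug in rank(bugs):
    print(f"{bug.score():8.1f}  {bug.title}")
```

With these invented numbers the crash ranks first, even though the typo is quicker to fix, because its ΔH is so much larger.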
Bugs that reduce T for future bugs
The best bugs to fix are those that make us more productive in the future. Reducing test flakiness, reducing technical debt, increasing the number of team members who are able to review code confidently and well: this all makes future bugs easier to fix, which is a huge multiplier to our overall effectiveness and thus to developer happiness.
Bugs affecting more people are more valuable (maximize N)
We will make more people happier if we fix a bug experienced by more people.
We need to be careful to count the people our metrics ignore. For example, if a bug prevented our product from working on Windows, we would have no Windows users, so according to our metrics the bug would affect nobody. However, fixing the bug would enable millions of developers to use our product, and that's the number that counts.
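One way to keep this in mind is to make the hidden population an explicit input when estimating N. This is only a sketch: the function name, its parameters, and the figure of two million blocked users are all assumptions made up for the example.

```python
def people_affected(current_users_hit: int, blocked_potential_users: int) -> int:
    """Estimate N, counting people who cannot use the product at all.

    Usage metrics only see current users, so people the bug locks out
    have to be added in by hand.
    """
    return current_users_hit + blocked_potential_users


# Hypothetical numbers: nobody on Windows can run the product today,
# so usage metrics report zero affected users.
naive_n = 0
effective_n = people_affected(current_users_hit=0, blocked_potential_users=2_000_000)
print(naive_n, effective_n)
```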
Bugs with greater impact on users are more valuable (maximize ΔH)
A slight improvement to the user experience is less valuable than a greater improvement. For example, if our application, under certain conditions, shows a message with a typo, and then crashes because of an off-by-one error in the code, fixing the crash is a higher priority than fixing the typo.
Bugs that are easier to fix are more valuable (minimize T)
The less time we spend working on something, the more time we will have to work on other things. Naturally, therefore, all else being equal, easier bugs are more impactful than harder bugs because we can fix more of the easier bugs in the same time.
This can feel counterintuitive. Surely fixing hard things is more valuable? Well, no. Having impact is better, and all other things being equal, it's more impactful to fix two easy bugs than one hard bug.
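The arithmetic can be spelled out with a small sketch. It assumes, hypothetically, a fixed budget of four days and bugs that all affect the same number of people by the same amount; none of these numbers come from real data.

```python
# Hypothetical estimates: every bug affects the same N people by the same ΔH.
N = 500
DELTA_H = 1.0
BUDGET_DAYS = 4.0


def total_gain(fix_time_days: float) -> float:
    """Happiness gained by spending the whole budget on bugs of this size."""
    bugs_fixed = int(BUDGET_DAYS // fix_time_days)
    return bugs_fixed * N * DELTA_H


easy = total_gain(2.0)  # two 2-day bugs fit in the budget
hard = total_gain(4.0)  # one 4-day bug fits in the budget
print(easy, hard)
```

Under these assumptions the two easy bugs deliver twice the total gain of the single hard one.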
Steps to reproduce make a bug more valuable
If a bug has steps to reproduce, we will have a much easier time fixing it. In general, we should focus on bugs like that rather than those where the first step will be determining what the problem even is, because in the time it would take us to figure out a problem, we could have fixed multiple issues where the problem was clear.
Again, we will make more users happier if we fix more bugs each affecting X people than if we fix fewer (but gnarlier) bugs each affecting X people.
Exceptions
A high-profile hard-to-reproduce bug may warrant the extra effort, because the number of people affected is high. We want to take into account the total impact of fixing the bug as well as the time it will take to fix it.
Deciding when to move on
Sometimes, T can turn out to be bigger than estimated. Something looks easy, but turns out to be hard. The right choice may be to dump all one has learnt into the tracking issue and move on to something that one can solve more quickly.
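One possible way to frame that decision, offered as a sketch rather than a rule, is to recompute the score using only the time still needed and compare it with the next-best candidate. The function name, its parameters, and the numbers are hypothetical, and ignoring time already spent is an assumption of this example.

```python
def should_move_on(n, delta_h, remaining_time, next_best_score):
    """Return True when the re-estimated score falls below the next candidate's.

    Assumption of this sketch: time already spent is gone either way,
    so only the time still needed counts towards T.
    """
    current_score = n * delta_h / remaining_time
    return current_score < next_best_score


# Hypothetical: a bug first estimated at 1 day now looks like it needs 10 more,
# while the next candidate in the queue scores 50.
print(should_move_on(n=200, delta_h=1.0, remaining_time=10.0, next_best_score=50.0))
```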
Deciding between tasks of equal merit
Sometimes, it's not easy to decide which of two or three or ten tasks should be prioritized. The icon button's splash radius is too large on a toolbar. Users can't tap on menu items that haven't appeared yet during a popup menu animation. The shadow on the toolbar doesn't quite extend to the far left of the screen. Which of these should we work on, if we only have the time to work on one? It can seem difficult to decide.
The key realization to solving this conundrum is both freeing and mildly unsettling: it doesn't matter. We can do whichever one we feel like.
It doesn't matter because they are (by definition) equally important, and (by definition) we can do only one. Whichever one we do, some people will be happier. Assuming that, across the project, we pick among these choices more or less randomly, we will avoid introducing any particular bias and the product as a whole will get better.
To put it another way: in either case, we are improving the product by the same people-happiness per unit bug fixing time. So the product gets better by the same amount.
This doesn't mean any one of these bugs or features is not important. It just means that they are equally important, and one won the lottery and got fixed.
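For anyone who would rather not agonize over the choice, the lottery can be made literal. This is a minimal sketch: the pick_one helper is hypothetical, and the task descriptions simply paraphrase the examples above.

```python
import random


def pick_one(tasks, rng=random):
    """Pick uniformly at random among tasks judged equally valuable."""
    return rng.choice(tasks)


tied = [
    "icon button splash radius too large",
    "menu items not tappable during popup animation",
    "toolbar shadow stops short of the left edge",
]
print(pick_one(tied))
```

Passing a seeded random.Random instance as rng makes the choice repeatable, which can be handy if the pick needs to be explained later.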