git bisect - how to quickly find a bug in your code?
Over the lifetime of a software project, people create a huge number of commits. A lot of them :) In the later stages of a project, during intensive refactoring, it's quite easy to introduce new bugs despite an existing QA process.
For more complex problems, one possible approach is to find the moment in the project's history when the bug was introduced. Instead of scanning the whole codebase, we only need to find the single commit that contains the faulty chunk of code.
Now we run into a basic difficulty: how do we efficiently find the very first commit that introduced the problem? The basic strategy is to find an old, working commit and walk forward commit by commit until we hit the bug. There is one problem: if there are a lot of commits (e.g. 300) between the good and the bad one, verification takes a long time. In the worst case we need to verify all 300 commits, which might be traumatic for testers. An automatic validation test (e.g. a unit test) is a big improvement, but if the test takes a long time to run, it's still not a perfect solution. Instead of reinventing the wheel, we can use a robust, ready-made solution: a little git command, git bisect.
How does it work?
The idea behind this tool is simple. It is related to the bisection method for root-finding (https://en.wikipedia.org/wiki/Bisection_method), which is the same idea as binary search. In other words, at each step we halve our range of commits. We check whether the commit in the middle of the range is correct or not. If it's OK, we repeat the procedure on the commits that follow the verified commit. If it's not OK, we repeat it on the range built from the tested commit and its predecessors. The algorithm ends when the range cannot be halved any further. What remains is our "root": the first commit with the bug. It can be proven that this algorithm needs O(log_2(N)) steps. So, for our example of 300 commits, we need only about 9 tests.
OK, let's move on to an example. Let's assume that we don't have automatic tests that could verify the problem, so we need to test manually. An example git bisect session looks as follows:
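The halving can be sketched as a small shell simulation. Nothing here touches git: commits are just the numbers 0..300, and FIRST_BAD, is_bad and find_first_bad are hypothetical names made up for this illustration. is_bad stands in for whatever manual or automatic test you would really run.

```shell
#!/bin/sh
# Simulated history: commits are numbered 0..300, oldest first.
# Commit 0 is known good, commit 300 is known bad.

# Hypothetical position of the first faulty commit (unknown in real life).
FIRST_BAD=217

# Stand-in for testing commit $1: succeeds when the commit is faulty.
is_bad() {
  [ "$1" -ge "$FIRST_BAD" ]
}

# Narrow the range (good, bad] until only one candidate remains.
# Leaves the answer in RESULT and the number of tests in STEPS.
find_first_bad() {
  fb_good=$1
  fb_bad=$2
  STEPS=0
  while [ $((fb_bad - fb_good)) -gt 1 ]; do
    fb_mid=$(( (fb_good + fb_bad) / 2 ))
    STEPS=$((STEPS + 1))
    if is_bad "$fb_mid"; then
      fb_bad=$fb_mid    # bug already present: look at older commits
    else
      fb_good=$fb_mid   # still fine: look at newer commits
    fi
  done
  RESULT=$fb_bad
}

find_first_bad 0 300
echo "first bad commit: $RESULT (found in $STEPS tests)"
```

With 300 commits the loop never needs more than 9 tests, which matches the log_2 estimate.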
# First, we need to define our search range
$ git bisect start
$ git bisect bad # The commit currently checked out is faulty
$ git bisect good da8d6c0d81b629 # Marks commit da8... as good
Bisecting: 66 revisions left to test after this (roughly 6 steps)
[b7161b5b98b6b2fd46e0e5c0472f3949b9f25a7f] refs: #2630 | Description
# After manual testing: it's a bad commit
$ git bisect bad # so we mark it as bad
Bisecting: 32 revisions left to test after this (roughly 5 steps)
[dfa7f15eed8236b732844124c83b5711f7167d46] Next commit description.
# ...followed by a series of git bisect good and git bisect bad answers...
In the end:
b7161b5b98b6b2fd46e0e5c0472f3949b9f25a7f is the first bad commit
Author: Jon Smith
Date: Thu May 3 10:58:45 2018 +0200
refs: #2630 | Bad commit description.
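As you can see, after the first checkout git estimated roughly 6 more steps, with 66 revisions still left to test. When you are done, git bisect reset returns the working tree to the commit you started from.
If we have automatic tests, it's easy to use them with git bisect. To do so, use the run subcommand: git bisect run ./phpunit.sh. Let's leave such a script as homework :)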
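For anyone who wants a starting point for that homework, here is a minimal sketch rather than a finished script. It relies on how git bisect run reads the script's exit status: 0 means the commit is good, 125 means it cannot be tested and should be skipped, and any other status from 1 to 127 means it is bad. The function name check_commit and the path vendor/bin/phpunit are assumptions made for this example; adjust them to your project. Treating every status of 125 or above as "skip" is also just one possible choice.

```shell
#!/bin/sh
# phpunit.sh: hypothetical wrapper for "git bisect run ./phpunit.sh".

# Run the given test command and translate its exit status:
#   0          -> 0    (good commit)
#   1..124     -> 1    (bad commit)
#   125 and up -> 125  (could not test, ask git to skip this commit)
check_commit() {
  "$@"
  cc_status=$?
  if [ "$cc_status" -eq 0 ]; then
    return 0
  elif [ "$cc_status" -lt 125 ]; then
    return 1
  else
    return 125
  fi
}

# Run the check only when executed under this file name, so the
# function can also be sourced and tried out by hand.
case "$0" in
  *phpunit.sh)
    # Assumed location of PHPUnit (a Composer install); not from the article.
    check_commit vendor/bin/phpunit
    exit $?
    ;;
esac
```

If the bug you are hunting makes the test runner itself crash with a high exit status, mapping that to "skip" would hide it, so change the last branch in that case.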