Several people asked me about this: what's the rigorous research showing that code review helps?
Most of our data comes from case studies, where we follow a single group and see how code review affected their existing system. These studies are incredibly useful for real-world data. t.co/bzubo2yGhb
One of our best sources here is the SmartBear analysis of their clients: they estimate that code review, done right, catches 70-90% of bugs. They also found that, in at least one case, code review cost half as much as letting bugs reach production. smartbear.com/SmartBear/medi…
The SmartBear analysis also found that many of these bugs would not have been caught by regular QA and testing, and that self-analysis also helps a ton:
While these results are very positive, SmartBear shouldn't be our only source, as they're selling a product. ibm.com/developerworks…
This group did case studies on open source projects and found a strong correlation between review discussion and code quality:
And this study found that students could catch almost 70% of defects in "design" reviews: smartbear.com/SmartBear/medi…
pitt.edu/~ckemerer/PSP_…
Here's a large-scale GitHub analysis finding positive impacts on security:
This article comparing code review to pair programming seems to suggest code review is just as good, but I have significant doubts about their methods: www2.eecs.berkeley.edu/Pubs/TechRpts/…
link.springer.com/article/10.102…
On the social side, about 97% of Google engineers were positive on code review. It also apparently pushed them to write smaller, more contained commits:
One of the most interesting results, though, comes from Microsoft. sback.it/publications/i…
The majority of their code review comments were about code quality, communication, and understanding:
This suggests the defect-finding benefits are secondary to the social benefits. And given how effective we already know CR is at finding bugs, those social benefits could be significant. microsoft.com/en-us/research…
Now, this is all interesting evidence, but it isn't a slam dunk. On the other hand... for pretty much every other technique we have conflicting or inconclusive evidence. CR is the only technique where we have tons of significant, favourable evidence showing strong benefits.
That's basically unheard of for a programming technique, and the reason I believe code review is the one technical practice we definitely, absolutely know improves our software.
Some updates:
The SmartBear paper defines defects as "anything that requires a change to be acceptable." This tricked me before: their 70-90% figure covers all defects, both bugs and code-improvement suggestions, not just bugs. So they aren't likely to catch 90% of bugs :(
Most studies find that about a quarter of the issues found in code reviews are bugs; the rest are quality-improvement issues (ieeexplore.ieee.org/document/46046… and testroots.org/assets/papers/…). These are also valuable! But what does this mean for _bug detection_?
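To make that question concrete, here's a rough back-of-envelope sketch in Python. Only the 70-90% catch rate and the "quarter of findings are bugs" ratio come from the numbers above; the defect and bug base rates are made up purely for illustration.

```python
# Back-of-envelope: what could "catches 70-90% of defects" mean for bugs?
total_defects = 100            # hypothetical codebase: 100 defects total (made up)
total_bugs = 30                # assume 30 of those are actual bugs (made up)

caught = total_defects * 0.80  # 80% overall catch rate (middle of 70-90%)
bugs_caught = caught * 0.25    # ~a quarter of review findings are bugs

print(f"bugs caught in review: {bugs_caught:.0f} of {total_bugs}")
print(f"implied bug catch rate: {bugs_caught / total_bugs:.0%}")
# -> 20 of 30, i.e. ~67%. But assume only 20 bugs existed and the same
#    inputs imply 100%. The headline number alone can't pin this down.
```

The point of the sketch: the bug-specific catch rate depends on a base rate the headline figure doesn't give us, so we have to look at studies that measure bugs directly.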
Most studies still show a strong effect: code review is very effective at finding bugs. For example, this case study (dl.acm.org/citation.cfm?i…) finds that one of the biggest indicators of post-release bugs is "poor code review". Most case studies and trials find similar results.
This is one of the reasons we've gotta be careful and read closely!
