One of my recent hobby projects was a blocks-based tool for doing basic data science, and I was very pleased when its test suite reached 100% line coverage. However, Inozemtseva2014 found that code coverage is a poor predictor of how effective a test suite is at detecting bugs once the size of the test suite is accounted for. Their method, briefly: they randomly sampled many fixed-size test suites from the full suites of several large Java programs, measured each sampled suite's coverage, and used mutation testing (seeding artificial faults into the code) to estimate how many faults each suite could detect.
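The sampling-and-correlation idea can be sketched in a few lines. This is a hypothetical toy simulation, not the authors' actual code or data: each test is modeled as a set of covered lines plus a set of killed mutants, and we correlate coverage with mutation score across many suites of the same size.

```python
import random

random.seed(0)

NUM_LINES, NUM_MUTANTS, NUM_TESTS = 200, 100, 60

# Hypothetical per-test data: which lines each test covers
# and which seeded mutants it kills.
tests = []
for _ in range(NUM_TESTS):
    lines = set(random.sample(range(NUM_LINES), k=random.randint(5, 40)))
    mutants = set(random.sample(range(NUM_MUTANTS), k=random.randint(0, 10)))
    tests.append((lines, mutants))

def coverage(suite):
    """Fraction of lines covered by the suite as a whole."""
    covered = set().union(*(lines for lines, _ in suite))
    return len(covered) / NUM_LINES

def effectiveness(suite):
    """Fraction of mutants killed -- the proxy for bug detection."""
    killed = set().union(*(mutants for _, mutants in suite))
    return len(killed) / NUM_MUTANTS

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Sample many suites of the SAME size, so suite size cannot
# explain any remaining variation in effectiveness.
suites = [random.sample(tests, k=10) for _ in range(500)]
xs = [coverage(s) for s in suites]
ys = [effectiveness(s) for s in suites]

print(f"correlation at fixed suite size: {pearson(xs, ys):.2f}")
```

Holding suite size constant is the key step: without it, bigger suites have both more coverage and more detected faults, which manufactures a correlation between the two.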
Generating suites this way allowed the authors to tackle much larger programs and test suites than they could otherwise examine. Their conclusions? Coverage and effectiveness are correlated when suite size is ignored, but once size is held constant the correlation becomes weak to moderate, and stronger forms of coverage do not add much predictive power.
In other words, more tests do find more bugs, but it's the number of tests and not their code coverage that has most of the predictive value. It's a surprising result, so if you'll excuse me, I have a couple of lecture slides on software testing I need to revise…
Inozemtseva2014: Laura Inozemtseva and Reid Holmes: "Coverage Is Not Strongly Correlated with Test Suite Effectiveness". Proceedings of the 36th International Conference on Software Engineering (ICSE 2014), doi:10.1145/2568225.2568271.