Everyone is thrilled with the new feature you’ve just deployed! But as it starts to gain popularity, you wonder if there might be a bug despite all the testing and code review that you and your team have done. The more it gets used, the more you start to notice some frustrating, hard-to-reproduce errors: database deadlocks.
Errors of this class are often discovered only after heavy usage in production. They seem rare and unlikely at first, but they can become crippling under a critical mass of simultaneous web users and parallel background jobs.
I’d like to share a technique for identifying the root cause of a deadlock, how Ruby on Rails can sometimes be a confounding factor, and a solution I’ve contributed that has all but eliminated deadlocks at Aha!
To understand how deadlocks happen and where to start looking, let’s review the minimum ingredients needed to produce the most basic deadlock scenario: