7 Lessons From 10 Outages

submitted by
Style Pass
2021-06-22 22:00:09

After 10 post-mortems in their first season, Tom and Jamie reflect on the common issues they’ve seen. Click through for details!

We’re just about through our inaugural season of The Downtime Project podcast, and to celebrate, we’re reflecting on recurring themes we’ve noticed in many of the ten outages we’ve pored over. It’s been remarkable how consistently certain patterns have shown up, either as risks or as assets, for the engineering teams tackling these incidents.

Out of these recurring patterns, we’ve extracted lessons that we intend to take into our own engineering teams. We’ve compiled seven of those lessons below for the benefit of any interested readers, in the hope that you, too, will find them useful to learn from and prepare for. As we go, we’ll include links to the outages and episodes where each theme occurred.

If you feel like we’ve missed any major patterns, or have any other feedback for us, please leave a comment. And thank you all for listening to the first season of The Downtime Project.
