Observability as a Day Zero Operation

Submitted by

Style Pass

2024-09-25 20:00:03

Applications, services and infrastructures are becoming ever more distributed. Meanwhile, expectations for fast feedback loops and iterations in the software delivery process are greater than ever. Today we build systems with an expectation that they will fail – so you had better have excellent troubleshooting tools. You need to watch and understand the system at baseline, and be ready to deal with surprises.

We’ve also had a long era of sprawl, in which developers, engineering teams and organisations have had a lot of autonomy in the decisions they make. That autonomy has enabled productivity in some dimensions, but it has also created major challenges in terms of complexity and lack of standardisation in databases, runtimes, monitoring tools, programming languages and so on. The cognitive overheads have become far harder to manage for individuals, teams and organisations. Managing and monitoring applications and services with complexity in so many dimensions has required a new set of disciplines and tools, which we now generally call Observability. Apparently, though, the definition is still in play – Observability 2.0, anyone?

One of the most critical aspects of the Observability revolution is the understanding that the people managing applications are often the people building them. Observability is a developer experience revolution as much as anything else. The last generation of monitoring, tracing and logging tools is no longer cutting it in terms of providing context, supporting useful querying and meeting users where they are. Observability is now something that happens across the entire software development lifecycle. With apologies to Robert M. Pirsig, author of Zen and the Art of Motorcycle Maintenance: