Indexing in Postgres seems simple, but it’s important to understand the basics of how it really works and the best practices for preventing system d

When Postgres Indexing Went Wrong

submited by
Style Pass
2024-09-24 09:30:04

Indexing in Postgres seems simple, but it’s important to understand the basics of how it really works and the best practices for preventing system downtime.

At a previous company, we managed a high-volume Postgres instance with billions of rows of transactional data. As we scaled, query performance became a key priority, and one of the first optimizations was adding indexes. To avoid downtime, we used CREATE INDEX CONCURRENTLY, which allows indexing large tables without locking out writes for hours. Initially, p99 query performance improved dramatically.

A few weeks later, another team launched a new feature that was built to rely heavily on the new index. Everything seemed routine—until the traffic spiked.

At first, the problem was subtle. A few queries took longer than expected. But within hours, the load began to spike. Query response times slowed to a crawl, and some requests were timing out.

We couldn’t immediately see why. The index was in place, a quick EXPLAIN ANALYZE confirmed it was being used. But users were still experiencing massive slowdowns, and we were on the brink of a full-scale production outage.

Leave a Comment