Making a Postgres query 1,000 times faster

submitted by
Style Pass
2024-05-15 06:00:09

Mattermost uses Elasticsearch in large deployments to reduce the load that search queries put on the database, while also returning better, fine-tuned results. For this to work, Elasticsearch needs to index all the data we want to search, so that it can retrieve it quickly when requested. Once the data is indexed, everything works as expected: our users are happy, our developers are happy, and life is good.

However, I recently tested something I hadn’t tried in a while: indexing a fairly large database (with 100 million posts) completely from scratch. When the database is already indexed, indexing new posts and files is quite fast, so normal usage of Elasticsearch is flawless. An index from scratch, however, is slow:

This screenshot shows our job system reporting that the Elasticsearch indexing job had been running for around 18 hours and hadn’t even finished half of what it needed to do 🙁 Worse, the progress was not linear: it slowed down more and more the further it got. Something was clearly wrong here.

Let’s start the investigation by identifying what exactly is slow here, since there are many moving parts: it could be the database, the Mattermost server, the Elasticsearch server, the network, or an under-resourced machine.
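One generic way to narrow down a list of suspects like this is to time each stage of the pipeline separately and compare the totals. The following is only a minimal sketch of that idea, not the instrumentation Mattermost uses: `StageTimer`, `fetch_batch`, and `index_batch` are hypothetical names, and the `time.sleep` calls are arbitrary stand-ins for the real database query and the Elasticsearch bulk request.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time and call counts per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def measure(self, stage):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raised an exception.
            self.totals[stage] += time.perf_counter() - start
            self.counts[stage] += 1

    def slowest(self):
        """Return the stage name with the largest accumulated time."""
        return max(self.totals, key=self.totals.get)


def fetch_batch(size):
    # Hypothetical stand-in for the database query that reads a batch of posts.
    time.sleep(0.02)  # arbitrary delay, for illustration only
    return list(range(size))


def index_batch(batch):
    # Hypothetical stand-in for the Elasticsearch bulk-index request.
    time.sleep(0.005)  # arbitrary delay, for illustration only
    return len(batch)


def run(batches=3, size=10):
    timer = StageTimer()
    for _ in range(batches):
        with timer.measure("database"):
            batch = fetch_batch(size)
        with timer.measure("elasticsearch"):
            index_batch(batch)
    return timer


if __name__ == "__main__":
    timer = run()
    for stage, total in timer.totals.items():
        print(f"{stage}: {total:.3f}s over {timer.counts[stage]} calls")
    print("slowest stage:", timer.slowest())
```

Logging the per-batch times instead of only the totals would also show whether one stage gets slower as the job advances, which is the kind of non-linear slowdown described above.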
