Hi, it’s us again, the ones who used to store our database in a single JSON file on disk, and then moved to etcd. Time for another change! We’re going to put everything in a single file on disk again.
As you might expect from our previous choice (and as many on the internet already predicted), we ran into some limits with etcd: database size, write transaction frequency, and, of particular note, generating indexes. All of these were surmountable limits, but we were quickly running into the biggest limit: me. Until now, I have been allowed to choose just about anything for a database as long as I do all the work. But at a certain point, that doesn’t scale. The way to solve the issues with etcd was bespoke code. Every time someone else had to touch it, I had to explain it. Especially the indexing code. (Sorry.) What we need is something easier to dive into and get back out of quickly, a database similar enough to common development systems that other engineers, working hard to solve other problems, don’t have to get distracted by database internals to solve their problem.
Reaching this limit on scaling the engineering team was entirely predictable, though it happened faster than we thought it would. So we needed something different, something more conservative than our previous choices. The obvious candidates were MySQL (or one of its renamed variants given who bought it) or PostgreSQL, but several of us on the team have operational experience running these databases and didn’t enjoy the prospect of wrestling with the ops overhead of making live replication work and behave well. Other databases like CockroachDB looked very tempting, but we had zero experience with them. And we didn’t want to lock ourselves into a cloud provider with a managed product like Spanner. Several requirements from our previous blog post also still apply, such as being able to run our entire test suite locally and hermetically, quickly and easily, ideally without VMs or containers.