Dolt's New Storage Engine

submitted by
Style Pass
2022-05-20 21:30:07

In the beginning, there was Noms. The creation of Aaron Boodman and Attic Labs, Noms introduced Prolly Trees, a novel search index structure that supports diff, merge, and sync operations. Noms development has since been halted, but its contributions live on in the open-source community. Replicache and Dolt both use Prolly Tree-based storage engines.

DoltHub was founded in 2018 with the vision of creating an open platform for data sharing and collaboration. We wanted to create Git-for-Data, and we built Dolt as our central collaboration tool. From the beginning, Dolt was built on Noms. It's why Dolt is the only SQL database that can branch, diff, merge, push, and pull. As DoltHub has grown, our focus has shifted from data sharing to making Dolt a production database. This week we launched Hosted Dolt, our first cloud offering of Dolt. As our aims have changed, performance has become a major focus of development. Optimizing indexed access in our storage engine has been particularly important for improving Dolt's efficiency, and we've made significant progress on catching up to MySQL on benchmarks.

Incrementally optimizing Noms has provided substantial gains in efficiency, but we've reached a point of diminishing returns. One major barrier is the need to maintain compatibility with the Noms binary serialization format. Altering this format is a major breaking change for Dolt customers and would require a database migration, but we believe it's the best path forward. Today we're announcing the alpha release of Dolt's new storage engine!
