
Behind the Scenes of Creating the World’s Biggest Graph Database

Submitted by Style Pass, 2021-06-17 17:00:15

I’m not saying that someone actually uttered these exact words, but I’m pretty sure we all thought them. That was a month ago, when we’d decided to try and build the biggest graph database that has ever existed.

Our earlier results, presented at FOSDEM, showed that, for a 1TB database, throughput and latency improve linearly with the number of shards it is distributed across. More shards, more performance.

The results looked good and confirmed that we had a very good understanding of the approach to scaling a graph database. Development of Fabric continued toward making it an integral part of Neo4j.

New technologies were created and improved upon (server-side routing is a good example) and made useful for non-Fabric setups as well.

But that 1TB dataset from FOSDEM kept nagging at us. 1TB is not that big, at least for Neo4j. We routinely have production setups with 10TB or more and, although they run on fairly large machines, Neo4j scales up pretty well.

We didn’t really need a solution for that; we needed a solution for really big databases. That’s why we had created Fabric, but we hadn’t found its limit yet.