Scuba: Diving into the extraordinary system Facebook uses to analyse millions of events per second

Submitted by
Style Pass
2024-04-03 10:30:05

Scuba, a distributed in-memory database, is the system Facebook uses to aggregate events for monitoring. This post draws its material from the two Scuba papers, “Scuba: Diving into Data at Facebook” and “Fast Database Restarts at Facebook”, giving preference to the second. It is written as if the two papers were one, with powerful visualizations, and it stays concise by stripping away discussions of alternative design decisions. Reading the papers after this post is encouraged, as they give additional details here and there. The post also discusses recent improvements at the end. Scuba has had 4 generations [3]. The bulk of this article focuses on the 1st generation and, partially, the 2nd. The 3rd and 4th generations are touched on where public data is available.

When a Facebook engineer (thanks, James Luo) shared Scuba with me, I was blown away. The massive scale at which it operates, along with its mind-boggling performance, its usefulness, and how well loved it is at Facebook, motivated me to explore its internals, as well as the surrounding Facebook ecosystem: Scribed, Calligraphus, Hive, LogDevice, FbThrift, and Scribe. Let’s start.
