
1-billion row challenge with Node.js

Submitted by
Style Pass
2025-01-08 15:00:07

The 1-billion row challenge (1brc) is a challenge to process a 12GB text file containing 1 billion rows. Each row is formatted as <stationName>;<temperature>\n, and the goal is to compute the min, max, and average temperature for each station.

For Node.js, the repository for the challenge can be found here. We will go through the baseline implementation, understand how it works, and improve it step by step until we reach a ~30x speedup.

In the baseline, each line is split on the ; character to get the station name and the temperature. The running results for each station are stored in a Map(). The temperature is also multiplied by 10 to avoid potential floating point errors.
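To make the description concrete, here is a minimal sketch of that kind of baseline. It is not the code from the challenge repository: the function names (addRow, summarize, aggregateFile), the shape of the Map entries, and the choice of Math.round and toFixed for rounding are all assumptions made for illustration.

```javascript
const fs = require("node:fs");
const readline = require("node:readline");

// Hypothetical helper: fold one "<stationName>;<temperature>" row into the Map.
function addRow(stats, line) {
  const [station, rawTemp] = line.split(";");
  // Keep tenths of a degree as an integer, e.g. "12.3" -> 123,
  // so sums do not accumulate floating point error.
  const temp = Math.round(parseFloat(rawTemp) * 10);

  const entry = stats.get(station);
  if (entry === undefined) {
    stats.set(station, { min: temp, max: temp, sum: temp, count: 1 });
  } else {
    if (temp < entry.min) entry.min = temp;
    if (temp > entry.max) entry.max = temp;
    entry.sum += temp;
    entry.count += 1;
  }
}

// Hypothetical helper: convert the integer aggregates back to degrees.
// The rounding here (toFixed) is an assumption and may differ from the
// rounding rule the challenge expects.
function summarize(stats) {
  const out = {};
  for (const [station, e] of stats) {
    out[station] = {
      min: (e.min / 10).toFixed(1),
      max: (e.max / 10).toFixed(1),
      mean: (e.sum / e.count / 10).toFixed(1),
    };
  }
  return out;
}

// Stream the file line by line so the whole input is never held in memory.
async function aggregateFile(path) {
  const stats = new Map();
  const rl = readline.createInterface({
    input: fs.createReadStream(path),
    crlfDelay: Infinity,
  });
  for await (const line of rl) {
    if (line.length > 0) addRow(stats, line);
  }
  return summarize(stats);
}
```

Calling aggregateFile with the path to the measurements file resolves to an object keyed by station name. Reading through readline one line at a time keeps the sketch simple, at the cost of creating a string for every row.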

On my machine, with an Apple M4 Pro, this implementation took 5m41.069s to finish the challenge. This will be the baseline we improve upon.

We need a profiling tool to understand where most of the time is spent. I like using Clinic.js Flame for this purpose. It generates a flamegraph from the profiling data, which can be easier to understand than raw profiling data.
