How tcmalloc Works | James Golick

2024-11-17 17:00:07

tcmalloc is a memory allocator optimized for high-concurrency workloads. The "tc" in tcmalloc stands for thread cache, the mechanism through which this particular allocator is able to satisfy certain (often most) allocations without taking a lock. It's probably the most well-conceived piece of software I've ever had the pleasure of reading, and although I can't realistically cover every detail, I'll do my best to go over the important points.
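To make the thread-cache idea concrete, here is a minimal sketch of the general pattern, not tcmalloc's actual code. Every name in it (`CentralPool`, `Allocate`, `Deallocate`, `kBlockSize`, `kBatch`) is hypothetical, it handles a single block size, and it uses `std::vector` for brevity where a real allocator would need an intrusive free list that does not itself call `malloc`. The point is only that the lock is taken when a thread's private list runs dry, and every other allocation touches thread-local state alone.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>
#include <vector>

constexpr std::size_t kBlockSize = 64;  // hypothetical: one fixed block size
constexpr std::size_t kBatch = 8;       // hypothetical: blocks fetched per refill

// Hypothetical shared pool. Every call into it takes the lock.
class CentralPool {
 public:
  // Appends `n` fresh blocks to `out` while holding the lock.
  void Fetch(std::size_t n, std::vector<void*>& out) {
    std::lock_guard<std::mutex> lock(mu_);
    lock_count_.fetch_add(1, std::memory_order_relaxed);
    for (std::size_t i = 0; i < n; ++i) {
      out.push_back(::operator new(kBlockSize));
    }
  }

  // How many times the lock has been taken (for illustration only).
  std::size_t lock_count() const {
    return lock_count_.load(std::memory_order_relaxed);
  }

 private:
  std::mutex mu_;
  std::atomic<std::size_t> lock_count_{0};
};

CentralPool g_central;

// Each thread owns its own free list, so touching it needs no lock.
thread_local std::vector<void*> tls_free_list;

void* Allocate() {
  if (tls_free_list.empty()) {
    // Slow path: refill a whole batch under the central lock.
    g_central.Fetch(kBatch, tls_free_list);
  }
  assert(!tls_free_list.empty());
  // Fast path: pop from the thread-local list, no lock involved.
  void* block = tls_free_list.back();
  tls_free_list.pop_back();
  return block;
}

void Deallocate(void* block) {
  // Freed blocks go back to the calling thread's list, again lock-free.
  tls_free_list.push_back(block);
}
```

In this sketch, the first `kBatch` calls to `Allocate()` on a thread cost one lock acquisition in total, and a block passed to `Deallocate()` is handed back by the next `Allocate()` on the same thread without touching the central pool.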

Like most modern allocators, tcmalloc is page-oriented, meaning that the internal unit of measure is usually pages rather than bytes. This makes it easier to reduce fragmentation and to increase locality in various ways. It also makes keeping track of metadata far simpler. tcmalloc defines a page as 8192 bytes[1], which is actually two pages on most Linux systems, where the hardware page size is typically 4096 bytes.
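As a rough illustration of why page-sized units keep the bookkeeping simple, the sketch below converts between bytes, page counts, and page numbers with plain shifts. The shift of 13 is assumed because it yields the 8192-byte page mentioned above; the helper names `PagesForBytes` and `PageIdOf` are hypothetical and not taken from the tcmalloc source.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: a shift of 13 gives the 8192-byte page described in the text.
constexpr std::size_t kPageShift = 13;
constexpr std::size_t kPageSize = std::size_t{1} << kPageShift;

static_assert(kPageSize == 8192, "2^13 bytes per allocator page");
static_assert(kPageSize == 2 * 4096, "two typical 4096-byte OS pages");

// Hypothetical helper: whole pages needed to hold `bytes` (rounds up).
constexpr std::size_t PagesForBytes(std::size_t bytes) {
  return (bytes + kPageSize - 1) >> kPageShift;
}

// Hypothetical helper: the number of the page that contains `addr`.
constexpr std::uintptr_t PageIdOf(std::uintptr_t addr) {
  return addr >> kPageShift;
}
```

Because a page number is just the high bits of an address, per-page metadata can be looked up from any pointer with a single shift.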

Chunks can be thought of as divided into two top-level categories. "Small" chunks are smaller than kMaxPages (which defaults to 128); they are further divided into size classes and satisfied by the thread caches or the central per-size-class caches. "Large" chunks are >= kMaxPages and are always satisfied by the central PageHeap.
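The split described above amounts to a single comparison. The sketch below assumes the chunk length is measured in pages and uses the default of 128 quoted in the text; the `Source` enum and `ClassifyChunk` function are hypothetical names for illustration, not part of tcmalloc.

```cpp
#include <cstddef>

// Default quoted in the text above.
constexpr std::size_t kMaxPages = 128;

// Hypothetical labels for the two top-level categories.
enum class Source {
  kThreadOrCentralCache,  // "small": served by size-class caches
  kPageHeap,              // "large": served by the central PageHeap
};

// Assumption: `pages` is the chunk length in allocator pages.
constexpr Source ClassifyChunk(std::size_t pages) {
  return pages < kMaxPages ? Source::kThreadOrCentralCache
                           : Source::kPageHeap;
}
```

Under these assumptions, a 127-page chunk still counts as small, while a chunk of exactly 128 pages goes to the PageHeap.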