Memory access patterns and performance

submitted by
Style Pass
2024-03-31 00:00:06

At first glance, you might be tempted to say that they will have the same performance. One of them allocates 80 gigabytes of memory, while the other allocates only 20. However, the latter iterates over its memory 4 times, so both perform exactly the same number of memory writes per call. This is a reasonable first stab, but it’s wrong.
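To make the "same number of writes" argument concrete, here is a minimal sketch of the two workloads. It is not the original code: the function names `write_pass` and `run` are hypothetical, the language is an assumption, and each chunk is scaled down from a gigabyte to 64 KiB so it runs anywhere. It only counts bytes written; it does not reproduce the timing difference.

```python
def write_pass(chunks):
    """Overwrite every byte of every chunk once; return the bytes written."""
    written = 0
    for chunk in chunks:
        chunk[:] = b"\x01" * len(chunk)  # touch the whole chunk
        written += len(chunk)
    return written


def run(num_chunks, passes, chunk_size):
    """Allocate num_chunks buffers, then write over all of them `passes` times."""
    chunks = [bytearray(chunk_size) for _ in range(num_chunks)]
    return sum(write_pass(chunks) for _ in range(passes))


CHUNK = 64 * 1024  # hypothetical stand-in for the 1 GB chunks in the text

large = run(num_chunks=80, passes=1, chunk_size=CHUNK)  # "80 GB, one pass"
small = run(num_chunks=20, passes=4, chunk_size=CHUNK)  # "20 GB, four passes"
print(large == small)
```

Both calls write 80 chunks' worth of bytes in total, which is why the write count alone cannot explain a performance gap.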

You might then take a closer look and pay more attention to the amount of memory being allocated. There’s got to be some overhead to allocating a gigabyte of memory, right? The first program does so 80 times, and the second only 20. Thus, maybe the second program runs faster, because it does not need to allocate as much memory. As it turns out, memory allocation time is basically constant for large chunks like this, so this is also the wrong intuition.
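One way to check the allocation claim on your own machine is to time how long it takes to reserve regions of different sizes. The sketch below assumes that large allocations are backed by anonymous mappings whose pages the OS fills in lazily on first touch; that is an assumption about the platform and allocator, not something stated above. The helper `time_reserve` is hypothetical, and the sizes are kept far below a gigabyte.

```python
import mmap
import time


def time_reserve(size):
    """Reserve `size` bytes as an anonymous mapping without touching them.

    Returns (elapsed_seconds, mapped_length).
    """
    start = time.perf_counter()
    region = mmap.mmap(-1, size)  # anonymous mapping, no file behind it
    elapsed = time.perf_counter() - start
    length = len(region)
    region.close()
    return elapsed, length


for mib in (1, 16, 128):
    elapsed, length = time_reserve(mib << 20)
    print(f"{mib:>4} MiB reserved in {elapsed * 1e6:.1f} us")
```

If the reported times stay roughly flat as the size grows, that matches the idea that reserving memory is cheap and the real cost comes later, when the pages are actually written.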

This would probably be an even more significant difference if I were running on a machine with a slower disk. Alas, this is running on an M3 MacBook Pro with a very fast disk, so there’s “only” a 4 second difference.
