The Alder Lake SHLX anomaly - tavianator.com

Apparently shlx is a "medium latency" (3 cycles) instruction on Alder Lake. My disappointment is immeasurable, and my day is ruined.

A bit of background: Alder Lake is the 12th generation of Intel Core processors. It's the first generation with a "hybrid architecture," containing both performance (P) and efficiency (E) cores. SHLX is a left-shift instruction introduced in the BMI2 instruction set. The main difference from SHL is that SHLX doesn't affect the FLAGS register. It's also a 3-operand instruction: the destination, the source, and the shift count are separate operands, whereas SHL shifts its destination in place and takes the count from CL or an immediate.
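To make the difference concrete, here is a rough C model of the two instructions for 64-bit operands. It's a sketch of the semantics, not anything the CPU or a compiler actually runs: the names `model_shl`, `model_shlx`, and `struct flags` are made up for illustration, and only the carry and zero flags are tracked (the real SHL updates more flags than that).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified FLAGS model: only CF and ZF are tracked. */
struct flags { bool cf, zf; };

/* Models `shl dst, cl`: two operands, dst is shifted in place,
 * and the flags are updated whenever the masked count is non-zero. */
void model_shl(uint64_t *dst, uint8_t cl, struct flags *f) {
    assert(dst && f);
    unsigned n = cl & 63;            /* 64-bit shifts mask the count to 6 bits */
    if (n == 0)
        return;                      /* a zero count leaves the flags alone */
    f->cf = (*dst >> (64 - n)) & 1;  /* CF = last bit shifted out */
    *dst <<= n;
    f->zf = (*dst == 0);
}

/* Models `shlx dst, src, count`: three operands, src is not modified,
 * and no flags are read or written. */
uint64_t model_shlx(uint64_t src, uint64_t count) {
    return src << (count & 63);      /* same 6-bit masking of the count */
}
```

For example, `model_shlx(3, 4)` yields 48 and leaves its input alone, while `model_shl` overwrites its first argument and may change the flags as a side effect.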

Left-shift is one of the simplest things to implement in hardware, so it's quite surprising that it should take 3 whole CPU cycles. It's been 1 cycle on every other CPU I'm aware of. It's even 1 cycle on Alder Lake's efficiency cores! Only the performance cores have this particular performance problem.

The 3-cycle figure Harold cited comes from uops.info. They even document the exact instruction sequence used in their benchmark that measured the 3-cycle latency, with a sample nanoBench command to reproduce it. Running that command on my laptop indeed measures 3 cycles of latency.
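The general idea behind a latency measurement like that is a dependency chain: each shift's output is the next shift's input, so the CPU can't overlap them, and the time per iteration approaches the instruction's latency. The sketch below illustrates that idea only. It is not nanoBench and not the uops.info instruction sequence; the names `shift_chain` and `ns_per_shift` are made up, it assumes GCC or Clang on x86-64 for the inline assembly (falling back to a plain C shift elsewhere), and it reports nanoseconds rather than cycles, so you'd have to multiply by your core's clock frequency yourself.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) && defined(__GNUC__)
#define HAVE_SHLX_ASM 1
/* One SHLX per iteration whose result feeds the next: x = x << count.
 * AT&T operand order is count, source, destination. */
__attribute__((target("bmi2")))
static uint64_t shlx_chain_asm(uint64_t x, uint64_t count, long iters) {
    for (long i = 0; i < iters; i++)
        __asm__ volatile("shlx %1, %0, %0" : "+r"(x) : "r"(count));
    return x;
}
#endif

/* Runs `iters` dependent shifts and returns the final value. */
uint64_t shift_chain(uint64_t x, uint64_t count, long iters) {
    assert(iters >= 0);
#ifdef HAVE_SHLX_ASM
    if (__builtin_cpu_supports("bmi2"))
        return shlx_chain_asm(x, count, iters);
#endif
    /* Portable fallback; the compiler is free to simplify this loop,
     * so it is only useful for checking the result, not for timing. */
    for (long i = 0; i < iters; i++)
        x <<= (count & 63);
    return x;
}

/* Average CPU time per dependent shift, in nanoseconds. */
double ns_per_shift(long iters) {
    assert(iters > 0);
    clock_t start = clock();
    volatile uint64_t sink = shift_chain(1, 0, iters);
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}
```

Calling `ns_per_shift` with a large iteration count (hundreds of millions, say) gives a usable average. Loop overhead, frequency scaling, and which core the thread lands on all affect the number, so treat it as a rough cross-check rather than a replacement for nanoBench.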
