If you're looking to write fast code in Rust, good news! Rust makes it really easy to write really fast code. The focus on zero-cost abstractions, the lack of implicit boxing and the static memory management mean that even naïve code is often faster than the equivalent in other languages, and certainly faster than naïve code in any equally safe language. Maybe, though, like most programmers, you've spent your whole programming career safely insulated from having to think about any of the details of the machine, and now you want to dig a little deeper and find out the real reason that Python script you rewrote in Rust runs 100x faster and uses a tenth of the memory. After all, they both do the same thing and run on the same CPU, right?
So, here's an optimization guide, aimed at those who know how to program but maybe don't know how their code maps to real ones and zeroes on the bare metal of the CPU. I'll try to weave practical tips about optimizing Rust code together with explanations of why each one is faster than the alternative, and we'll end with a case study from the Rust standard library.
This post assumes decent familiarity with programming, a beginner's familiarity with Rust and almost no familiarity with CPU architecture.