
Rust zero-cost abstraction in action

submitted by
Style Pass
2021-06-08 03:00:07

One of my colleagues was experimenting with Rust. He started by writing a sudoku solver, which he had already written in C. Once he had finished writing it in Rust, he was very disappointed because the Rust version was twice as fast as the C version, which he had hand-optimised with every trick he knew to make it perform well. He eventually managed to make the C version as fast as the Rust version by removing the intrinsics.

I don’t know anything about reading x86 opcodes other than the basics, but I can confidently say that I don’t see any loops here.

The RDI register is where our argument n is stored, and the RAX register is where the return value is stored. So if we replace RAX with RETURN and RDI with N, the generated assembly roughly looks like this:

This essentially calculates the formula (N-2)*(N-3)/2 + 2*N - 3, which simplifies to N*(N-1)/2. That's the closed-form formula for summing the numbers from 1 to N-1!
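To check the algebra, expand the first term: (N-2)*(N-3)/2 = (N*N - 5*N + 6)/2. Adding 2*N - 3 = (4*N - 6)/2 gives (N*N - N)/2 = N*(N-1)/2. The sketch below compares a plain loop against both expressions. It is only an illustration: the function names are hypothetical, and the loop over the exclusive range `1..n` is an assumption about the shape of the original function, which is not reproduced in this text.

```rust
/// Loop-based sum over the exclusive range 1..n.
/// Assumption: this mirrors the kind of function the compiler optimised;
/// the original source is not shown in the article text.
pub fn sum_loop(n: i64) -> i64 {
    let mut total = 0;
    for i in 1..n {
        total += i;
    }
    total
}

/// The expression read off the generated assembly:
/// (N-2)*(N-3)/2 + 2*N - 3.
/// (N-2)*(N-3) is a product of consecutive integers, so it is always even
/// and the division by 2 is exact.
pub fn sum_from_asm(n: i64) -> i64 {
    (n - 2) * (n - 3) / 2 + 2 * n - 3
}

/// The simplified closed form N*(N-1)/2, i.e. 1 + 2 + ... + (N-1).
pub fn sum_closed(n: i64) -> i64 {
    n * (n - 1) / 2
}
```

For example, `sum_loop(10)`, `sum_from_asm(10)` and `sum_closed(10)` all return 45. The three functions agree for every non-negative `n`. For negative `n` the loop body never runs, so the loop returns 0 while the raw expressions do not, which is why compiled code would need to handle that case separately.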
