Design issues in LLVM IR

submitted by
Style Pass
2021-06-08 06:00:06

On the whole, LLVM has a well-designed intermediate representation (IR), which is specified in the language reference. However, there are a number of areas where design mistakes have been made. And while the LLVM project is generally open to addressing such issues, mistakes in core IR design tend to be firmly embedded in the code base, making them hard to fix in practice. This blog post discusses some of the problems.

In the middle-end, we commonly view transformations not as optimizations, but as canonicalizations. There are many different ways to write the same program, and the purpose of target-independent middle-end transforms is to reduce them to a single form. Of course, the chosen form will often (but not always) coincide with the more efficient form. (This does not apply to all transforms; for example, runtime unrolling and vectorization are certainly not canonicalization transforms.)

Why do we care about canonicalization? The main reason is that it reduces the number of permutations that other passes need to deal with. If 1 + a is canonicalized to a + 1, everything else only needs to handle the latter form. Another reason is that it improves the effectiveness of redundancy-elimination transforms (like common subexpression elimination and global value numbering). Let’s look at an example:
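
As a rough illustration of both points, here is a minimal Python sketch over a hypothetical tuple-based mini-IR. It is not LLVM's API, and the names `canonicalize` and `count_unique` are made up for this example. It moves constants to the right-hand side of commutative operations, then uses structural equality as a simple stand-in for common subexpression elimination.

```python
# Hypothetical mini-IR (an assumption for illustration, not LLVM's data model):
# an expression is a variable name (str), an integer constant (int),
# or a tuple (op, lhs, rhs).
COMMUTATIVE = {"add", "mul"}


def canonicalize(expr):
    """Recursively move constants to the right-hand side of commutative ops."""
    if not isinstance(expr, tuple):
        return expr  # variables and constants are already canonical
    op, lhs, rhs = expr
    lhs, rhs = canonicalize(lhs), canonicalize(rhs)
    # Swap only when the constant is on the left and the right is not a constant.
    if op in COMMUTATIVE and isinstance(lhs, int) and not isinstance(rhs, int):
        lhs, rhs = rhs, lhs
    return (op, lhs, rhs)


def count_unique(exprs):
    """Stand-in for CSE: structurally identical expressions are computed once."""
    return len(set(exprs))


# "1 + a" and "a + 1" compute the same value but are written differently.
exprs = [("add", 1, "a"), ("add", "a", 1)]

before = count_unique(exprs)                            # no canonicalization
after = count_unique([canonicalize(e) for e in exprs])  # canonicalized first
```

Without canonicalization, the two expressions look different and both are kept. After canonicalization they are structurally identical, so a redundancy-elimination pass only needs to keep one of them, and any later pattern match only has to recognize the `a + 1` form.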
