Optimising string processing in Rust

submitted by
Style Pass
2023-03-18 21:30:05

In this article, I'd like to explore how to process strings faster in Rust. I'll take the example of a function to escape the HTML <, > and & characters, starting from a naive implementation and trying to make it faster.

Warning: I'm not an expert in this domain. While I do have a computer science background, I haven't worked in this field for nearly ten years, I am not a Rust expert, and I am certainly not an optimisation expert. I'm not claiming to show you the fastest way to solve this problem; I mainly want to share my experience and what I learned along the way.

The problem is, broadly speaking, the following: we have a string containing some text, we want to transform it under certain conditions, and we return another string. I'll take the specific example of escaping HTML characters: we don't want the text we are displaying to cause problems in the browser, so we escape the following characters: <, > and & (as &lt;, &gt; and &amp; respectively).

We only escape these because we assume non-adversarial content (the use case is converting local Markdown files to HTML). More rules would clearly be needed to escape input from an untrusted user, in which case it would be a good idea to use an existing library that guards against e.g. XSS attacks.
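To make the starting point concrete, here is a minimal sketch of what a naive implementation could look like. It is only an illustration of the problem statement: the function name escape_html_naive is hypothetical, and the character-by-character approach is an assumption about what "naive" means here, not necessarily the exact code being optimised.

```rust
/// Hypothetical naive escaper: walks the input one character at a time
/// and appends either the character itself or its HTML entity.
fn escape_html_naive(input: &str) -> String {
    // The output is at least as long as the input, so reserve that much up front.
    let mut output = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '<' => output.push_str("&lt;"),
            '>' => output.push_str("&gt;"),
            '&' => output.push_str("&amp;"),
            // Every other character is copied through unchanged.
            _ => output.push(c),
        }
    }
    output
}
```

With this sketch, escape_html_naive("a < b") returns "a &lt; b", and a string with nothing to escape is still copied in full into a new allocation, which is one of the costs a faster version might try to avoid.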
