Regular Expression Improvements in .NET 7

Submitted by Style Pass, 2022-05-12 22:30:07

A friend once quipped to me that “computer science is entirely about sorting and searching”. While that’s a gross overgeneralization, there’s a grain of truth to it. Searching is, in one way, shape, or form, at the heart of many workloads, and it’s so important that multiple domain-specific languages have been created over the years to ease the task of expressing searches. Arguably none is more ubiquitous than regular expressions.

A regular expression, or regex, is a string that lets a developer express a pattern to search for, which makes it a very common way to search text and extract the key parts of each match. Every major development platform has one or more regex libraries, either built into the platform or available separately, and .NET is no exception. .NET’s System.Text.RegularExpressions namespace has been around since the early 2000s, introduced as part of .NET Framework 1.1, and is used by thousands upon thousands of .NET applications and services.

At the time it was introduced, it was a state-of-the-art design and implementation. Over the years, however, it didn’t evolve significantly, and it fell behind the rest of the industry. This was rectified in .NET 5, where we re-invested in making Regex very competitive, with many improvements and optimizations to its implementation (elaborated on in “Regex Performance Improvements in .NET 5”). However, those efforts didn’t expand much upon its functionality. Now with .NET 7, we’ve again heavily invested in improving Regex, both for performance and for significant functional enhancements.
