Rosie Pattern Language

submitted by
Style Pass
2024-09-27 21:00:05

In brief: RPL is an alternative to regex, providing a better syntax, unit tests, and packages of patterns, among other benefits.

The Rosie project began at IBM within a large machine learning project that required feature extraction from hundreds of text-based data sources. We had too many regular expressions to manage, and found them difficult to test, maintain, and share.

Sharing was particularly important because we wanted to harvest regexes already in use within products whose data we were mining. But regexes are not very portable across programming languages. Nor do they compose well, making it challenging, if not impossible, to reliably build larger expressions out of smaller ones.
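As a rough illustration of the composition problem (not an example from the Rosie project), here is a minimal Python sketch using the standard `re` module. The `year` and `suffix` patterns are hypothetical, made up for this example. Each works on its own, but gluing them together as strings changes what the alternation means.

```python
import re

# Two small, hypothetical patterns that are each fine in isolation.
year = r"\d{4}|\d{2}"   # a four-digit or a two-digit year
suffix = r" AD"         # a literal era marker

# Naive composition by string concatenation.
# The result r"\d{4}|\d{2} AD" parses as (\d{4}) | (\d{2} AD),
# so the suffix attaches only to the two-digit alternative.
naive = year + suffix

# Composing safely requires remembering to wrap the piece in a group.
composed = f"(?:{year}){suffix}"

naive_match = re.fullmatch(naive, "2024 AD")        # no match
composed_match = re.fullmatch(composed, "2024 AD")  # matches

print("naive:", naive_match)
print("composed:", composed_match)
```

Nothing in the regex syntax itself flags the naive version as wrong. The person combining the patterns has to know the precedence rules and wrap every piece defensively, which gets harder as the number of shared patterns grows.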

I left IBM in 2018 and brought the project to North Carolina State University, where it has thrived with contributions from undergraduate and graduate students.
