(Right-Nulled) Generalised LR Parsing | Whatever

submited by
Style Pass
2025-01-12 14:30:07

I hope you know a bit about LR parsing, otherwise this blog post won’t make much sense to you. You can read all about it in a previous post of mine. Today I want to discuss the problems with getting your language parsed in LR(1), or even LR(k). And how an old way to solve those problems is with a more powerful algorithm, that can parse any context-free grammar, no restrictions, no complaints about conflicts.

Now in theory, any deterministic[1] context-free language can be parsed with an LR(1) grammar. Writing one might be difficult though. If you can write an LR(k) grammar instead, you can mechanically transform it into an LR(1) grammar. But that doesn’t necessarily mean that your resulting grammar is readable. If it’s not very readable, like with any programming artefact, it’s going to be a pain in your behind at some point. Because you’ll make mistakes, or you’ll want to change it, and now you need to understand what’s going on again. While you may be able to describe your language within the restrictions of an LR(1) grammar, the encoding required (manual or automatic) will make your grammar less readable. There’s a reason why some programming language’s manuals have a “high level” or “natural” grammar describing the language and it’s intuitive structure, and separately an executable grammar that fits in some grammar class. Another complication is that the output of a parser generated from a grammar follows that grammar. And so, a worse grammar makes for a worse experience using the parser output.

For these reasons, generalised parsing is a popular prototyping tool, since it allows you to work with the natural grammar. Consider the following two grammars that describe the same language:

Leave a Comment