I have an unusually good memory, especially for symbols, words, and text, but since I don’t use regular expressions (ahem) regularly, they’re one of those parts of computer programming and HTML/EPUB editing that I find myself relearning over and over each time I need it. How did something this arcane but powerful even get started? Naturally, its creators were trying to discover (or model) artificial intelligence.
That’s the crux of this short history of “regex” by Buzz Andersen over at “Why is this interesting?”
The term itself originated with mathematician Stephen Kleene. In 1943, neuroscientist Warren McCulloch and logician Walter Pitts had just described the first mathematical model of an artificial neuron, and Kleene, who specialized in theories of computation, wanted to investigate what networks of these artificial neurons could, well, theoretically compute.
In a 1951 paper for the RAND Corporation, Kleene reasoned about the types of patterns neural networks were able to detect by applying them to very simple toy languages—so-called “regular languages.” For example: given a language whose “grammar” allows only the letters “A” and “B”, is there a neural network that can detect whether an arbitrary string of letters is valid within the “A/B” grammar or not? Kleene developed an algebraic notation for encapsulating these “regular grammars” (for example, a*b* in the case of our “A/B” language), and the regular expression was born.