A Brief History of Markov Chains


One of the most common simple techniques for generating text is a Markov chain. The algorithm takes one or more input texts, divides them into tokens (usually letters or words), and generates new text based on the statistics of short sequences of those tokens. If you're interested in the technical details, Allison Parrish has a good tutorial on implementing Markov chains in Python.
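To make the idea concrete, here is a minimal sketch of a word-level Markov chain generator in Python. It is not the implementation from Parrish's tutorial; the function names, the `order` parameter, and the tiny example corpus are illustrative choices made for this sketch.

```python
# A minimal word-level Markov chain text generator (illustrative sketch).
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each sequence of `order` tokens to the tokens observed to follow it."""
    tokens = text.split()
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        key = tuple(tokens[i:i + order])
        chain[key].append(tokens[i + order])
    return chain

def generate(chain, length=20):
    """Walk the chain, sampling each next token from the observed followers."""
    state = random.choice(list(chain.keys()))
    output = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: this state was only seen at the end of the text
            break
        next_token = random.choice(followers)
        output.append(next_token)
        state = tuple(output[-len(state):])
    return " ".join(output)

# Hypothetical toy corpus, just to show the mechanics.
corpus = "the quick brown fox jumps over the lazy dog and the quick red fox runs"
print(generate(build_chain(corpus, order=1), length=12))
```

Higher values of `order` make the output track the source text more closely, at the cost of simply reproducing long passages when the corpus is small.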

This technique has been around for a very long time, but exactly when it was first developed is unclear. The following is a sketchy, work-in-progress timeline of what I've been able to piece together.

1907-1913: Andrei Markov develops the mathematical theory of Markov chains, using as an example an analysis of the patterns of vowels and consonants in Pushkin's poem Eugene Onegin. (link)

1948: Claude Shannon publishes A Mathematical Theory of Communication, laying the foundations of information theory. As an example, he manually generates samples of text ("the series of approximations to English") using a Markov chain process ("Markoff process"), based on an unknown source text ("a book"). (link)
