You’re likely reading this text in a browser. Press Ctrl+F (⌘+F on macOS) and search for the word "text" on this page. The browser will instantly show you how many times the word appears. Even in texts hundreds of times longer than this page, browsers can quickly find the desired substring. Today, we’ll look at the algorithms that make this possible.
Of course, these algorithms aren’t just used in browsers but also in text editors, in the grep utility, in standard libraries of programming languages and even in bioinformatics–at last, DNA is just a string, and genes are its substrings.
We’ll start with a simple naive algorithm, then gradually improve and refine it. But first, let’s agree on the terminology we’ll use. If you’re already familiar with basic string notations, alphabets, prefixes and suffixes, you can skip this section or return to it later when something will seem to be unclear.