A Tale of 4+ Strings - Quil's Fluffy World

submitted by
Style Pass
2021-06-24 23:00:05

Most programming languages need to deal with text in some way or another—and programming languages for writing interactive fiction need to deal a lot with text. The way modern languages do it is to have some sort of String type, which will generally support text encoded using some Unicode format.

But text is deceptively simple. Even if we don’t get into all of the complexities of Unicode and internationalisation (“I just want to count the characters in this text, how hard could that be?”), the requirements on how you store and operate on text vary wildly depending on which operations you need and which limitations you have. For example, storing text as one contiguous block of bytes is good for displaying it but terrible for editing it in a text editor, since an insertion in the middle means shifting everything after it. A rope is the complete opposite: cheap to edit, but slower to read from start to end. Storing Unicode as UTF-16 is great for implementing operations on a JavaScript string, but it wastes too much memory on small devices like mobile phones.
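To make the "how hard could counting be?" question concrete, here is a small Python sketch. The `describe` helper is a hypothetical name introduced only for this illustration. It reports three plausible "lengths" for the same text, and shows that the answer depends on both the encoding and on how an accented letter happens to be written.

```python
import unicodedata

# "é" written two ways: one precomposed code point,
# or a plain "e" followed by a combining acute accent.
precomposed = "\u00e9"
decomposed = "e\u0301"


def describe(text: str) -> dict:
    """Return several plausible answers to 'how long is this text?'."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-le" avoids the 2-byte byte-order mark that "utf-16" adds.
        "utf16_bytes": len(text.encode("utf-16-le")),
    }


print(describe(precomposed))  # {'code_points': 1, 'utf8_bytes': 2, 'utf16_bytes': 2}
print(describe(decomposed))   # {'code_points': 2, 'utf8_bytes': 3, 'utf16_bytes': 4}
print(describe("hello"))      # {'code_points': 5, 'utf8_bytes': 5, 'utf16_bytes': 10}

# Both spellings render as the same character and normalise to the same string.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

Note how plain ASCII text like "hello" takes twice as many bytes in UTF-16 as in UTF-8, which is the memory cost mentioned above.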

Because of this, even though we generally talk about “String” as a single type, modern languages tend to have several of them, each embodying different trade-offs. This may be exposed to the user (Haskell has at least 5 in the standard library, and you’re supposed to pick the trade-off that fits your use-case), but it may also just be a runtime detail (JavaScript implementations have one “String” type, but multiple representations covering interning, ropes, slices, and ASCII-only special cases for saving memory).
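As a rough illustration of "one string type, several representations", the Python sketch below hides a flat representation and a rope-style concatenation node behind the same small set of functions. All names here (`Flat`, `Concat`, `length`, `concat`, `flatten`) are hypothetical and chosen for this example. It is not how any particular JavaScript engine or Haskell library is implemented, only a minimal sketch of the idea.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Flat:
    """Contiguous storage: cheap to read, costly to concatenate."""
    data: str


@dataclass(frozen=True)
class Concat:
    """Rope node: concatenation only records the two halves."""
    left: "Text"
    right: "Text"
    length: int


Text = Union[Flat, Concat]


def length(t: Text) -> int:
    # Concat nodes cache their length, so this never walks the tree.
    return len(t.data) if isinstance(t, Flat) else t.length


def concat(a: Text, b: Text) -> Text:
    # No characters are copied here; the work is deferred to flatten().
    return Concat(a, b, length(a) + length(b))


def flatten(t: Text) -> str:
    # Iterative left-to-right traversal, so deep ropes do not hit
    # Python's recursion limit.
    parts = []
    stack = [t]
    while stack:
        node = stack.pop()
        if isinstance(node, Flat):
            parts.append(node.data)
        else:
            stack.append(node.right)  # pushed first, so visited last
            stack.append(node.left)
    return "".join(parts)


greeting = concat(concat(Flat("Hello"), Flat(", ")), Flat("world"))
print(length(greeting))   # 12
print(flatten(greeting))  # Hello, world
```

The caller only ever sees `Text`. Whether a given value is stored contiguously or as a tree of pieces is a detail that the three functions absorb, which is roughly the position a runtime is in when it picks a representation behind a single “String” type.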
