Strings do too many things

submitted by
Style Pass
2024-02-10 02:30:04

We use strings for identifiers, human writing, structured data, and grammars. If you instead use symbols for identifiers then you can be more confident a given string isn't an identifier.

(The line between "structured data" and "grammar" is really fuzzy; is a CSV file data or a grammar? Maybe making a distinction isn't useful, but they feel different to me.)
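To make the symbols idea concrete, here is a minimal sketch in Python. Python has no symbol type, so an `Enum` is used as a rough stand-in; the `Status` class and the `describe` function are hypothetical names invented for this illustration.

```python
from enum import Enum, auto


class Status(Enum):
    # Hypothetical identifiers; the members stand in for symbols.
    ACTIVE = auto()
    SUSPENDED = auto()


def describe(status: Status) -> str:
    # The return value is a str, so at this point it is human writing,
    # not an identifier.
    if status is Status.ACTIVE:
        return "This account is active."
    return "This account is suspended."


print(describe(Status.ACTIVE))
```

Because identifiers never travel as `str` in this sketch, any `str` you meet is one of the other three kinds of string.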

When you see a string in code, you want to know what kind of string it is. We use these strings for different purposes and we want to do different things to them. We might want to upcase or downcase identifiers for normalization purposes, but we don't split or find substrings in them. But you can do those operations anyway because all operations are available to all strings. It's like how if you store user ids as integers, you can take the average of two ids. The burden of using strings properly is on the developer.
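One way to move that burden off the developer is to wrap identifiers in a type that only exposes the operations you actually want. The following is a sketch under that assumption, not a recommendation of a specific design; `Identifier` and its `normalized` method are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Hypothetical wrapper: an identifier that is deliberately not a str."""

    _name: str

    def normalized(self) -> "Identifier":
        # Case normalization is an operation we do want on identifiers.
        return Identifier(self._name.casefold())

    def __str__(self) -> str:
        return self._name


user = Identifier("Admin")
same_user = Identifier("ADMIN")

# Equality after normalization works as expected.
assert user.normalized() == same_user.normalized()

# Splitting and substring search are simply not part of the interface.
assert not hasattr(user, "split")
assert not hasattr(user, "find")
```

The underlying `str` is still reachable through `str(user)`, so this discourages misuse rather than preventing it.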

Almost all programming languages have an ASCII grammar, but they need to support Unicode strings, because human writing needs Unicode. How do you lexicographically sort a list of records when one of them starts with "𐎄𐎐𐎛𐎍"?
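The sorting problem shows up even with much tamer input. The sketch below compares Python's default code point ordering with a crude, hand-rolled sort key; `crude_key` is a hypothetical helper for illustration and is not a substitute for real locale-aware collation.

```python
import unicodedata

# The last item is "Éclair", written with an escape so the precomposed
# form of the first letter is unambiguous.
words = ["Zebra", "apple", "\u00c9clair"]

# The default sort compares code points: uppercase ASCII first,
# then lowercase ASCII, then everything beyond ASCII.
by_code_point = sorted(words)


def crude_key(text: str) -> str:
    # Hypothetical helper: decompose, drop combining marks, then casefold.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


by_crude_key = sorted(words, key=crude_key)

print(by_code_point)  # ['Zebra', 'apple', 'Éclair']
print(by_crude_key)   # ['apple', 'Éclair', 'Zebra']
```

Neither ordering is correct in general. What counts as the right order depends on the reader's language, which is a property of human writing and not something an identifier or a grammar ever has to care about.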
