Denizen is a CMS for personal websites. I want to work on multi-language support – people should be able to make websites and posts in any language they want. (My own blog is in English, Turkish, and Toki Pona, so this is a blocker for eating my own dog food.)
Every aspect of this feels like something other people must have already solved, so I wanted to publicly ask if anyone has the secret right answers.
Requirements: Users should be able to set their site to “any language”. I know that’s impossible to define, of course. I’m happy to settle for “every language with an ISO 639 code”, including conlangs like Toki Pona (tok), which I want for my own site. However, I can’t find a good list anywhere. Wikipedia has one list that only covers the two-letter codes, and another that’s split across 26 pages and is missing quite a bit of data. Ideally, I could just download one big JSON file of every assigned code, with each language’s native and English names, as well as labels for the macrolanguages and for non-language codes like ‘zxx’.
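To make the wish concrete, here is a rough sketch of the shape such a file could take and how it might be loaded. The field names (`code`, `alpha2`, `english_name`, `native_name`, `scope`) and the `Language` class are made up for illustration, not taken from any existing dataset, and the five entries are only a hand-written sample.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Language:
    """One entry of the hypothetical language list (field names are assumptions)."""
    code: str                      # three-letter code, e.g. "tok"
    english_name: str
    native_name: Optional[str]     # None where a native name makes no sense
    scope: str                     # "individual", "macrolanguage", or "special"
    alpha2: Optional[str] = None   # two-letter code, if one is assigned


# Hand-written sample; a real file would contain every assigned code.
SAMPLE_JSON = """
[
  {"code": "eng", "alpha2": "en", "english_name": "English", "native_name": "English", "scope": "individual"},
  {"code": "tur", "alpha2": "tr", "english_name": "Turkish", "native_name": "Türkçe", "scope": "individual"},
  {"code": "tok", "alpha2": null, "english_name": "Toki Pona", "native_name": "toki pona", "scope": "individual"},
  {"code": "zho", "alpha2": "zh", "english_name": "Chinese", "native_name": "中文", "scope": "macrolanguage"},
  {"code": "zxx", "alpha2": null, "english_name": "No linguistic content", "native_name": null, "scope": "special"}
]
"""


def load_languages(text: str) -> dict[str, Language]:
    """Parse the JSON list into a dict keyed by three-letter code."""
    return {entry["code"]: Language(**entry) for entry in json.loads(text)}


LANGUAGES = load_languages(SAMPLE_JSON)
```

With something like this, a settings page could offer every entry whose `scope` is not `"special"`, and still accept `zxx` for posts that have no linguistic content.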
In addition to marking up the website and the posts in it, the language setting would be used for Denizen’s UI, including phrases in the public blog like “reply to” or “last updated”. Since Denizen can’t possibly be translated into every ISO 639 language, it needs a fallback mechanism.
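One way the fallback could work, as a minimal sketch: try the site's language tag, then progressively shorter prefixes of it, then a default language. This is loosely modeled on the "lookup" idea from BCP 47 language matching, not a full implementation of it. Everything named here (`UI_STRINGS`, `fallback_chain`, `translate`, the string keys) is hypothetical and not Denizen's actual code, and the translation table is a tiny illustrative sample.

```python
FALLBACK_LANGUAGE = "en"  # assumption: English is the language of last resort

# Illustrative sample only; the Turkish entry is deliberately incomplete
# so that the fallback has something to do.
UI_STRINGS = {
    "en": {"reply_to": "reply to", "last_updated": "last updated"},
    "tr": {"last_updated": "son güncelleme"},
}


def fallback_chain(tag: str, default: str = FALLBACK_LANGUAGE) -> list[str]:
    """Return the tags to try, most specific first, ending with the default."""
    parts = tag.split("-")
    # "tr-TR" yields "tr-TR", then "tr"
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if default not in chain:
        chain.append(default)
    return chain


def translate(key: str, site_language: str) -> str:
    """Look up a UI phrase, falling back along the chain.

    If no language has the phrase, return the key itself so the page
    still renders something instead of failing.
    """
    for lang in fallback_chain(site_language):
        text = UI_STRINGS.get(lang, {}).get(key)
        if text is not None:
            return text
    return key
```

For example, `translate("last_updated", "tr-TR")` finds the Turkish phrase via the `tr` prefix, while `translate("reply_to", "tok")` falls all the way through to English because the sample table has no Toki Pona strings. Falling back per phrase rather than per language means a partially translated UI still uses whatever translations exist.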