A few days ago, I saw a cluster of tweets about OpenAI's o1 randomly switching to Chinese while reasoning -- here's a good example. I think I've seen it switch languages a few times as well. Thinking about it, Chinese -- or any other language written in a non-Latin alphabet -- would be particularly noticeable, because those notes describing what it's thinking about flash by pretty quickly, and you're only really likely to notice something weird if it's immediately visibly different to what you expect. So perhaps it's spending a lot of its time switching from language to language depending on what it's thinking about, and then it translates back to the language of the conversation for the final output.
Why would it do that? Presumably certain topics are covered better in its training set in specific languages -- it will have more on Chinese history in Chinese, Russian history in Russian, and so on. But equally possibly, some languages are easier for it to reason about certain topics in. Tiezhen Wang, a bilingual AI developer, tweeted that he preferred doing maths in Chinese "because each digit is just one syllable, which makes calculations crisp and efficient". Perhaps there's something similar there for LLMs.
That got me thinking about the 17th-century idea of a Philosophical Language. If you've read Neal Stephenson's Baroque Cycle books, you'll maybe remember it from there -- that's certainly where I heard about it. The idea was that natural human languages were not very good for reasoning about things, and the solution would be to create an ideal, consciously-designed language that was more rational. Then philosophers (or, scientists as we'd say these days) could work in it and get better results.