With the advent of ChatGPT, large language models (LLMs) went from a relatively niche topic to something that many, many people have been exposed to. ChatGPT is presented as an entertaining system to chat with, a dialogue partner, and (through Bing) a search interface.* But fundamentally, it is a language model, that is, a system trained to produce likely sequences of words based on the distributions in its training data. Because it models those distributions very closely, it is good at spitting out plausible-sounding text in different styles. But, as always, if this text makes sense, it’s because we, the readers, are making sense of it.
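To make "likely sequences of words based on the distributions in its training data" concrete, here is a minimal sketch, assuming a bigram model over a made-up toy corpus. It is vastly simpler than the neural networks behind ChatGPT, and the names `train_bigram` and `generate` are hypothetical, chosen only for illustration. What it shares with larger systems is that it only ever sees which word forms follow which.

```python
import random
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample up to `length` words, each drawn in proportion to how
    often it followed the previous word in the training text."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # this word never had a successor in training
            break
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
    return out

# Toy corpus (an assumption for illustration, not real training data).
corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", 8)))
```

Every adjacent pair of words in the output occurred somewhere in the corpus, so the result can look locally fluent, yet nothing in the program refers to cats, mats, or anything else outside the text.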
In Climbing Towards NLU: On Meaning, Form, and Understanding in the Age of Data (2020), Alexander Koller and I lay out the argument that such systems can’t possibly learn meaning (either “understand” their input or express communicative intent) because their training regimen consists of only the form of language. The distinction between form and meaning in linguistic systems is subtle, not least because once we (humans!) have learned a language, as soon as we see or hear any form in that language, we immediately access the meaning as well.
But we can only do that because we have learned the linguistic system to which the form belongs. For our first languages, that learning took place in socially situated, embodied interactions that allowed us to get an initial start on the linguistic system, and it was then extended through more socially situated, embodied interactions, including some in which we used what we already knew about the language to learn more. For second languages, we might have started with instruction that explicitly leveraged our first language skills.