Generative AI is polluting language

submitted by
Style Pass
2024-09-23 16:30:17

Just for fun, I used OpenAlex to chart the usage of some of those words from 1990 to 2024 in the context of academic writing. It's important to note here that widespread use of ChatGPT came in 2023, which happens to be the very year these charts take a striking jump in the frequency of LLM words like delve, leverage, or tapestry.

Source: openalex.org

An alarming number of scientific papers are being written with ChatGPT. All language and word usage data are now irreversibly polluted. What effect will this have on language development? Will human and LLM use of language simply freeze and stop developing, remaining forever in a limbo of word choices designed to convey an air of cultivated authoritativeness?

Perhaps this doesn't necessarily point to the over-use of AI in academic paper writing. Maybe it only points to writers being influenced by AI in their word choices. Either way, the effect is the same. We lead the AI, the AI leads us.

Eventually, LLMs will be trained on data generated by other LLMs, and they probably already are. At some point, the training data will be mostly LLM-generated slop. If LLM word choices influence how humans use language, what happens if AI-generated content slowly collapses into incoherency due to being trained on LLM-generated data? Will it stagnate human use of language, or regress it to the point of meaninglessness?

AI has polluted word use data so much that projects to gather natural language word usage data are starting to give up. We've reached an inflection point of word pollution that we are neither prepared for, nor able to understand the long-term implications of.

"...your scientists were so preoccupied with whether or not they could that they didn't stop to think if they should."

I've mused for years that people under 30 seem to think they should reinvent the wheel, culturally speaking. The young seem to want to discard the ways of their forebears and start from scratch, discovering their own way of doing things. Now, in the post-LLM world, all who create without the use of generative AI truly are at the precipice, wandering in an uncharted wilderness, and carrying language with them. If you're writing from your own brain, you are now at the forefront of human language.

Behind us, the masses languish in their AI-generated slop language, generating and consuming, consuming and generating, being led by the machine even as they lead the machine. But us, we create because we must. We use language in novel ways because it is our condition, it is what we have no choice but to engage in. We are the last holdout of genuine humans using language to convey things to other humans. It's all uncharted territory now.

~[MD]

©2021- Craig Wilson.

