When LLMs Learn to Lie

submited by
Style Pass
2024-10-23 12:30:05

It has become abundantly clear that some people use large language models (LLMs) for nefarious purposes. These artificial intelligence (AI) systems can write convincing spam, generate false news and information, spread propaganda, and even produce dangerous software code.

Yet these concerns, while deeply troubling, primarily reflect human prompting techniques rather than any attempt to tamper with the model. By altering LLM functionality, governments, businesses, and political figures can more subtly sway public opinion and mislead the masses.

Several effective methods to accomplish this have emerged. It is possible to craft hidden messages and codes at websites and other online locations that search engines pick up and plug into query results. It also is possible to “jailbreak” models using other types of engineered input. In addition, more subtle AI optimization (AIO) methods that mimic Google Search Engine Optimization (SEO) are taking shape.

“Bad actors aim to misuse AI tools for different purposes,” observed Josh A. Goldstein, a research fellow for the CyberAI Project at Georgetown University’s Center for Security and Emerging Technology. “The risks likely grow as new and more powerful tools emerge and we become more reliant on AI.”

Leave a Comment