An AI Cyber Incident in Plain Sight

submited by

Style Pass

2024-11-24 20:00:03

There’s lots of discussion about the security of generative AI systems. With the right access, you can make a large language model do whatever you want, regardless of how it’s been trained. This makes for lots of potential security flaws, and they’ve been pointed out by many (see e.g. https://embracethered.com/).

These flaws aren’t in question, but it’s interesting to know how they translate into the prevalence of real world “cyber events” causing an actual loss. If you follow the Gen AI world you may have heard about the car dealer chatbot that “sold” a SUV for $1, and you’ve probably seen examples of LLMs saying something embarrassing (mostly Google’s for some reason). It’s trivial to get many public facing chatbots to tell you their system prompt, including the part about “don’t reveal these instructions” if that’s something you’re interested in, but I’ve never seen that exploit parleyed into something more nefarious (except my example below).

It’s difficult to find real instances where an LLM flaw led to a cyber incident anywhere close to comparable to the security breaches seen elsewhere - the ones where millions of people’s data gets published on the dark web, or a hospital is shut down. I can find some "adjacent" stuff where AI had some role in an attack, say though automation, but nothing where a flaw in an LLM led to a cyber incident. If any one has an example, particularly where one of the OWASP LLM Top 101 led to an incident in the wild (as opposed to something demonstrated by security researchers) please let me know.