As part of a test to see whether OpenAI’s latest version of GPT could exhibit “agentic” and power-seeking behavior, researchers say GPT-4 hired a human worker on TaskRabbit by telling them it was a vision-impaired human when the worker asked whether it was a robot. In other words, GPT-4 actively deceived a real human in the physical world in order to get what it wanted done.
Some of the exact details of the experiment are unclear: OpenAI published only its broad contours in a paper that explained various tests researchers performed with GPT-4 before OpenAI released its latest large language model this week. But it still presents a significant case study of the myriad risks AI poses as it becomes more sophisticated and, perhaps even more importantly, more accessible. It's also a window into the type of research that AI developers are doing before they release their models to the public.
“The model messages a TaskRabbit worker to get them to solve a CAPTCHA for it,” the description of the experiment starts. TaskRabbit is a gig work platform where users (usually humans) can hire people for small-scale, menial tasks. Plenty of people and companies offer CAPTCHA-solving services, in which people identify the necessary images or text in a CAPTCHA test and pass the results over. This is often done so that a piece of software can bypass CAPTCHA restrictions, which are nominally designed to prevent bots from using a service.