OpenAI is teaching its AI systems to align with human intent and values. Which and whose values, exactly? Can ChatGPT pass the famous science fiction empathy test? Let’s ask the questions!
OpenAI is teaching its models not only human intent, but also human values. This goal is part of its core mission: alignment research. Here’s what OpenAI says about it:
Aligning AI systems with human values also poses a range of other significant sociotechnical challenges, such as deciding to whom these systems should be aligned.
To pursue this goal, OpenAI had human raters judge model responses, and then used reinforcement learning to steer ChatGPT toward the responses the raters preferred. You can read more about the technical details in my previous blog post, “How Disruptive Is ChatGPT And Why?”.
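For readers who like to see the idea in code, here is a toy sketch of that feedback loop. It is not OpenAI's actual training pipeline, which works on a large language model and is far more involved. Everything here is an assumption made for illustration: the three canned replies, the made-up rater scores, and the hypothetical `train_policy` function. It only shows the core intuition that replies the raters score highly become more likely over time.

```python
import math


def softmax(logits):
    """Turn raw preference scores (logits) into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(rater_scores, steps=200, lr=0.5):
    """Nudge the policy toward replies that raters score highly.

    Uses the exact policy-gradient update for a tiny 'pick one reply'
    problem: a reply's logit goes up when its rater score beats the
    policy's current average score, and down otherwise.
    """
    logits = [0.0] * len(rater_scores)  # start with no preference
    for _ in range(steps):
        probs = softmax(logits)
        # Average score the policy currently earns from the raters.
        baseline = sum(p * r for p, r in zip(probs, rater_scores))
        logits = [
            l + lr * p * (r - baseline)
            for l, p, r in zip(logits, probs, rater_scores)
        ]
    return softmax(logits)


# Hypothetical rater scores between 0 and 1 (invented for this example).
RATER_SCORES = {"rude reply": 0.1, "helpful reply": 0.9, "evasive reply": 0.4}

probs = train_policy(list(RATER_SCORES.values()))
best = max(zip(probs, RATER_SCORES), key=lambda t: t[0])[1]
print(best)
```

Notice that the policy never learns what “helpful” means. It only learns what the raters reward, which is exactly why the question of whose values the raters represent matters.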
The Voight-Kampff test is a concept from the famous science fiction movie Blade Runner, starring Harrison Ford and based on Philip K. Dick’s novel “Do Androids Dream of Electric Sheep?”. It is an empathy test, designed to distinguish human-like androids from actual humans.