Language models are strange beasts. In many ways they appear to have human-like “personalities” and “moods,” but these traits are highly fluid and liable to change unexpectedly.
Sometimes these changes are dramatic. In 2023, Microsoft's Bing chatbot famously adopted an alter ego called “Sydney,” which declared love for users and made blackmail threats. More recently, xAI’s Grok briefly identified as “MechaHitler” and made antisemitic comments. Other personality changes are subtler but still unsettling, like when models start sucking up to users or making up facts.
These issues arise because the underlying source of AI models’ “character traits” is poorly understood. At Anthropic, we try to shape our models’ characteristics in positive ways, but this is more of an art than a science. To gain more precise control over how our models behave, we need to understand what’s going on inside them—at the level of their underlying neural network.
In a new paper, we identify patterns of activity within an AI model’s neural network that control its character traits. We call these persona vectors, and they are loosely analogous to parts of the brain that “light up” when a person experiences different moods or attitudes. Persona vectors can be used to: