OpenAI trained o1 and o3 to ‘think’ about its safety policy

submitted by

Style Pass

2024-12-23 04:30:03

OpenAI announced a new family of AI reasoning models on Friday, o3, which the startup claims to be more advanced than o1 or anything else it’s released. These improvements appear to have come from scaling test-time compute, something we wrote about last month, but OpenAI also says it used a new safety paradigm to train its o-series of models.

On Friday, OpenAI released new research on “deliberative alignment,” outlining the company’s latest way to ensure AI reasoning models stay aligned with the values of their human developers. The startup used this method to make o1 and o3 “think” about OpenAI’s safety policy during inference, the phase after a user presses enter on their prompt.

This method improved o1’s overall alignment with the company’s safety principles, according to OpenAI’s research. In other words, deliberative alignment decreased the rate at which o1 answered “unsafe” questions (at least ones deemed unsafe by OpenAI) while improving its ability to answer benign ones.

As AI models rise in popularity and power, AI safety research seems increasingly relevant. But at the same time, it’s more controversial: David Sacks, Elon Musk, and Marc Andreessen say some AI safety measures are actually “censorship,” highlighting the subjective nature of these decisions.
