OpenAI's new open-source model is basically Phi-5

OpenAI just released its first-ever open-source large language models, called gpt-oss-120b and gpt-oss-20b. You can talk to them here. Are they good models? Well, that depends on what you’re looking for. They’re great at some benchmarks, of course (OpenAI would never have released them otherwise), but weirdly bad at others, like SimpleQA.

Some people really like them. Others on Twitter really don’t. From what I can tell, they’re technically competent but lack a lot of out-of-domain knowledge: for instance, they have broad general knowledge about science but don’t know much about popular culture. We’ll know in six months how useful these models are in practice, but my prediction is that they will end up in the category of “performs much better on benchmarks than on real-world tasks”.

In 2024, Sebastien Bubeck led the development of Microsoft’s open-source Phi series of models. The big idea behind those models was to train exclusively on synthetic data: instead of text pulled from books or the internet, the training set is text generated by other language models or hand-curated textbook material. Synthetic data is scarcer than ordinary data, because instead of downloading terabytes of it for free you have to pay to generate every token. But the trade-off is that you have complete control over your training data. What happens when you train a model entirely on high-quality synthetic and curated data?
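To make that concrete, here is a minimal sketch of what the data-generation side of “training on synthetic data” can look like: prompt an existing teacher model for textbook-style passages on chosen topics and collect them into a corpus for later pretraining. This is an illustration under stated assumptions, not the Phi team’s actual pipeline; the teacher model, the prompts, and the JSONL output format are all placeholders.

```python
# Minimal sketch of synthetic-data generation (not the actual Phi pipeline).
# Assumptions: a small open teacher model via Hugging Face transformers,
# a handful of placeholder topics, and a JSONL file as the output corpus.
import json

from transformers import pipeline  # pip install transformers torch

# Placeholder teacher model; any instruction-tuned generator would do better.
generator = pipeline("text-generation", model="gpt2")

topics = ["Newton's laws of motion", "binary search trees", "photosynthesis"]

with open("synthetic_corpus.jsonl", "w") as f:
    for topic in topics:
        prompt = f"Write a short, clear textbook-style explanation of {topic}.\n"
        result = generator(
            prompt,
            max_new_tokens=300,
            do_sample=True,
            temperature=0.8,
            return_full_text=False,  # keep only the generated continuation
        )
        passage = result[0]["generated_text"].strip()
        f.write(json.dumps({"topic": topic, "text": passage}) + "\n")
```

At real scale you would also filter and deduplicate the generated text before pointing a pretraining run at it, but the point is the control: every token in that file is something you chose to generate.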
