Ollama has partnered with OpenAI to bring OpenAI's latest state-of-the-art open-weight models to Ollama. The two models, 20B and 120B, deliver a new local chat experience and are designed for powerful reasoning, agentic tasks, and versatile developer use cases.
OpenAI uses quantization to reduce the memory footprint of the gpt-oss models. The models are post-trained with the mixture-of-experts (MoE) weights quantized to the MXFP4 format, at 4.25 bits per parameter. The MoE weights account for over 90% of the total parameter count, so quantizing them to MXFP4 lets the smaller model run on systems with as little as 16 GB of memory and the larger model fit on a single 80 GB GPU.
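The 4.25 bits-per-parameter figure follows from MXFP4's block structure: each block of 32 four-bit (E2M1) elements shares one 8-bit scale. A small sketch of the arithmetic (the block layout is from the MXFP4 format; the ~90% MoE fraction is from the description above, and the model sizes are rounded assumptions):

```python
# MXFP4: blocks of 32 four-bit (E2M1) elements share one 8-bit (E8M0) scale.
BLOCK_SIZE = 32
ELEMENT_BITS = 4
SCALE_BITS = 8

bits_per_param = ELEMENT_BITS + SCALE_BITS / BLOCK_SIZE
print(bits_per_param)  # 4.25

# Rough size of the MoE weights alone, assuming ~90% of parameters
# are MoE weights (per the description above).
def moe_weight_gb(total_params: float, moe_fraction: float = 0.9) -> float:
    moe_params = total_params * moe_fraction
    return moe_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

print(round(moe_weight_gb(20e9), 1))   # ~9.6 GB of MoE weights for the 20B model
print(round(moe_weight_gb(120e9), 1))  # ~57.4 GB of MoE weights for the 120B model
```

The remaining non-MoE parameters, activations, and KV cache add overhead on top of these figures, which is why the practical targets are 16 GB systems and a single 80 GB GPU rather than the raw weight sizes.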
Ollama supports the MXFP4 format natively, without additional quantization or conversion. New kernels were developed for Ollama's new engine to support the MXFP4 format.