A lot of models have been released in the last few weeks: Kimi K2, Qwen3-Coder, GLM-4.5, gpt-oss, Claude Opus 4.1, diffusion models, and there's no end in sight.
In Amp, a new model isn't just another entry in a model-selection dropdown. It's part of a whole in which different models have different jobs to do, and for each job we want to use the best model, regardless of cost or deployment concerns. So when a new model comes along, we ask where it fits.
With this post, we want to show you a week in the life of the Amp team as we evaluate new models. Impressions, ideas, tips — we'll share what we discover.
This morning I took GPT-5 out for a proper spin, not just testing it but actually putting it to use to fix a bug in the Amp CLI. This time I did something I usually don't: I used voice dictation and ended up with a long, rambly prompt full of redundant information.
To my surprise (and that of everyone within hearing distance here in the office), GPT-5 fixed the bug in a single turn. Flawlessly. I committed the code just as GPT-5 wrote it.