Huge announcment from Google this morning: Introducing Gemini 2.0: our new AI model for the agentic era. There’s a ton of stuff in there (including

Simon Willison’s Weblog

submited by

Style Pass

2024-12-11 20:30:03

Huge announcment from Google this morning: Introducing Gemini 2.0: our new AI model for the agentic era. There’s a ton of stuff in there (including updates on Project Astra and the new Project Mariner), but the most interesting pieces are the things we can start using today, built around the brand new Gemini 2.0 Flash model. The developer blog post has more of the technical details.

Gemini 2.0 Flash is a multi-modal LLM. Google claim it’s both more capable and twice as fast as Gemini 1.5 Pro, their previous best model.

The new Flash can handle the same full range of multi-modal inputs as the Gemini 1.5 series: images, video, audio and documents. Unlike the 1.5 series it can output in multiple modalities as well—images and audio in addition to text. The image and audio outputs aren’t yet generally available but should be coming early next year.

I released llm-gemini 0.7 adding support for the new model to my LLM command-line tool. You’ ll need a Gemini API key—then install LLM and run: