By Wes Davis, a weekend editor who covers the latest in tech and entertainment. He has written news, reviews, and more as a tech journalist since 2020.
Google could preview its own take on Rabbit’s large action model concept as soon as December, reports The Information. “Project Jarvis,” as it’s reportedly codenamed, would carry out tasks for users, including “gathering research, purchasing a product, or booking a flight,” according to three people with direct knowledge of the project who spoke with the outlet.
Powered by a future version of Google’s Gemini, Jarvis reportedly works only in a web browser (it’s tuned specifically for Chrome). The tool is aimed at helping people “automate everyday, web-based tasks” by taking and interpreting screenshots and then clicking buttons or entering text, The Information writes. In its current state, it apparently takes “a few seconds” between actions.
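For a rough sense of what a screenshot-then-act loop like that involves, here is a minimal Python sketch. It is not Google's implementation, and nothing in it comes from The Information's report beyond the general pattern: every name (Action, run_agent, FakeBrowser, scripted_policy) is hypothetical, the "screenshot" is a plain dictionary rather than an image, and a hard-coded rule stands in for the model that would interpret the screen.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step the agent takes. All fields are illustrative assumptions."""
    kind: str        # "click" or "type"
    target: str      # hypothetical label for an on-screen element
    text: str = ""   # text to enter when kind == "type"


def run_agent(
    take_screenshot: Callable[[], dict],
    choose_action: Callable[[dict, List[Action]], Optional[Action]],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, pick an action, perform it, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = take_screenshot()               # observe
        action = choose_action(screen, history)  # interpret (a model would go here)
        if action is None:                       # nothing left to do
            break
        perform(action)                          # click or type
        history.append(action)
    return history


class FakeBrowser:
    """Stand-in for a real browser so the sketch runs on its own."""

    def __init__(self) -> None:
        self.query = ""
        self.submitted = False

    def screenshot(self) -> dict:
        return {"query": self.query, "submitted": self.submitted}

    def perform(self, action: Action) -> None:
        if action.kind == "type":
            self.query = action.text
        elif action.kind == "click" and action.target == "search_button":
            self.submitted = True


def scripted_policy(screen: dict, history: List[Action]) -> Optional[Action]:
    """Hard-coded rule standing in for a model that reads the screenshot."""
    if not screen["query"]:
        return Action("type", "search_box", "flights to Lisbon")
    if not screen["submitted"]:
        return Action("click", "search_button")
    return None


browser = FakeBrowser()
steps = run_agent(browser.screenshot, scripted_policy, browser.perform)
print([(step.kind, step.target) for step in steps])
```

Each pass through the loop needs a fresh look at the screen before the next action can be chosen, which is one plausible reason a tool built this way would pause between steps.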
The biggest AI companies are all working on models with capabilities similar to what The Information describes. Microsoft’s Copilot Vision will let you talk with it about webpages you’re viewing. Apple Intelligence is expected to be aware of what’s on your screen and do things for you across multiple apps at some point in the next year. Anthropic debuted a “cumbersome and error-prone” Claude beta update that can use a computer for you, and OpenAI is reportedly working on a version of that, too.