Google is reportedly developing a ‘computer-using agent’ AI system
Image: The Verge
Google could preview its own take on Rabbit’s large action model concept as soon as December, reports The Information. “Project Jarvis,” as it’s reportedly codenamed, would carry out tasks for users, including “gathering research, purchasing a product, or booking a flight,” according to three people with direct knowledge of the project who spoke with the outlet.
Powered by a future version of Google’s Gemini, Jarvis reportedly works only in a web browser (it’s tuned specifically for Chrome). The tool is aimed at helping people “automate everyday, web-based tasks” by taking and interpreting screenshots and then clicking buttons or entering text, The Information writes. In its current state, it apparently takes “a few seconds” between actions.
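To make that screenshot-then-act cycle concrete, here is a minimal Python sketch of the general pattern. It is not Google's implementation, and nothing in it comes from The Information's report beyond the loop itself: the names `Action`, `run_agent`, `capture`, `decide`, and `perform` are hypothetical, the "screenshot" is a plain dictionary standing in for an image, and a hard-coded rule stands in for the model that would interpret it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step the agent takes. All fields are illustrative assumptions."""
    kind: str        # "click" or "type"
    target: str      # human-readable description of the UI element
    text: str = ""   # text to enter when kind == "type"


def run_agent(
    capture: Callable[[], dict],
    decide: Callable[[dict, List[Action]], Optional[Action]],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: observe the screen, pick an action, carry it out."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                 # observe current state
        action = decide(screenshot, history)   # interpret and choose
        if action is None:                     # nothing left to do
            break
        perform(action)                        # click or type
        history.append(action)
    return history


def demo():
    """Toy run against a fake search page held in a dictionary."""
    page = {"query": "", "submitted": False}

    def capture() -> dict:
        return dict(page)  # stand-in for a real screenshot

    def decide(shot: dict, history: List[Action]) -> Optional[Action]:
        # Stand-in for a model: fill the box, then press the button.
        if not shot["query"]:
            return Action("type", "search box", "flights to Tokyo")
        if not shot["submitted"]:
            return Action("click", "search button")
        return None

    def perform(action: Action) -> None:
        if action.kind == "type":
            page["query"] = action.text
        elif action.kind == "click":
            page["submitted"] = True

    return run_agent(capture, decide, perform), page
```

Because every step requires a fresh observation and a fresh decision, a loop like this naturally pauses between actions, which fits the reported delay of “a few seconds.”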
The biggest AI companies are all working on models with capabilities like those The Information describes. Microsoft’s Copilot Vision will let you talk with it about webpages you’re viewing. Apple Intelligence is expected to be aware of what’s on your screen and do things for you across multiple apps at some point in the next year. Anthropic debuted a “cumbersome and error-prone” Claude beta update that can use a computer for you, and OpenAI is reportedly working on a version of that, too.
The Information cautions that Google’s plan to show Jarvis off in December is subject to change. The company is reportedly considering releasing it to a small number of testers to help find and work out bugs.