Google’s RT-2 AI model brings us one step closer to WALL-E

“First-of-its-kind” robot AI model can recognize trash and perform complex actions.

A Google robot controlled by RT-2. (credit: Google)

On Friday, Google DeepMind announced Robotic Transformer 2 (RT-2), a “first-of-its-kind” vision-language-action (VLA) model that uses data scraped from the Internet to enable better robotic control through plain language commands. The ultimate goal is to create general-purpose robots that can navigate human environments, similar to fictional robots like WALL-E or C-3PO.

When humans want to learn a task, they often read and observe. In a similar way, RT-2 utilizes a large language model (the technology behind ChatGPT) that has been trained on text and images found online. RT-2 uses this information to recognize patterns and perform actions even if the robot hasn't been specifically trained to do those tasks—a concept called generalization.

For example, Google says that RT-2 can allow a robot to recognize and throw away trash without having been specifically trained to do so. It uses its understanding of what trash is and how it is usually disposed of to guide its actions. RT-2 even recognizes discarded food packaging or banana peels as trash, despite the potential ambiguity.
