Cheap AI “video scraping” can now extract data from any screen recording

Researcher feeds screen recordings into Gemini to extract accurate information with ease.

Recently, AI researcher Simon Willison wanted to add up his charges from using a cloud service, but the payment values and dates he needed were scattered among a dozen separate emails. Inputting them manually would have been tedious, so he turned to a technique he calls “video scraping,” which involves feeding a screen recording into an AI model, similar to ChatGPT, to extract the data.

What he discovered seems simple on its surface, but the quality of the result has deeper implications for the future of AI assistants, which may soon be able to see and interact with what we’re doing on our computer screens.

“The other day I found myself needing to add up some numeric values that were scattered across twelve different emails,” Willison wrote in a detailed post on his blog. He recorded a 35-second video scrolling through the relevant emails, then fed that video into Google’s AI Studio tool, which allows people to experiment with several versions of Google’s Gemini 1.5 Pro and Gemini 1.5 Flash AI models.
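
The final step of that workflow, adding up the values once a model has pulled them out of the video, can be sketched in a few lines of Python. This is a minimal illustration, not Willison's actual code: it assumes the model was asked to return a JSON list of date and amount pairs, and the `MODEL_OUTPUT` string, its field names, and the `total_charges` helper are hypothetical placeholders rather than figures from his emails.

```python
import json
from decimal import Decimal

# Hypothetical model response. These are placeholder values for illustration,
# not the real charges from the emails described in the article.
MODEL_OUTPUT = """
[
  {"date": "2024-01-05", "amount": "12.50"},
  {"date": "2024-02-05", "amount": "7.25"},
  {"date": "2024-03-05", "amount": "30.00"}
]
"""


def total_charges(raw_json: str) -> Decimal:
    """Parse the extracted rows and sum the 'amount' field of each one."""
    rows = json.loads(raw_json)
    # Decimal keeps currency arithmetic exact, unlike binary floats.
    return sum((Decimal(row["amount"]) for row in rows), Decimal("0"))


if __name__ == "__main__":
    print(f"Total charges: {total_charges(MODEL_OUTPUT)}")
```

Keeping the amounts as strings and converting them with `Decimal` avoids the rounding drift that floating-point sums can introduce, which matters when the whole point is to trust the total.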
