Simon Willison’s Weblog

submitted by
Style Pass
2024-10-17 13:00:07

I didn’t particularly feel like copying and pasting all of the numbers out one at a time, so I decided to try something different: could I record a screen capture while browsing around my Gmail account and then extract the numbers from that video using Google Gemini?

I recorded the video using QuickTime Player on my Mac: File -> New Screen Recording. I dragged a box around a portion of my screen containing my Gmail account, then clicked on each of the emails in turn, pausing for a couple of seconds on each one.
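The extraction step could also be scripted rather than run by hand. What follows is a minimal sketch, not the method used in this post: it assumes the `google-generativeai` Python package, an API key in a `GEMINI_API_KEY` environment variable (an assumed name), and hypothetical field names such as `date` and `amount`. The helpers `build_prompt`, `parse_json_reply` and `extract_from_video` are illustrative names, and the prompt wording is invented for the example.

```python
import json
import os
import time

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in


def build_prompt(fields):
    """Build an extraction prompt asking for one JSON object per email shown."""
    field_list = ", ".join(fields)
    return (
        "This video is a screen recording of someone clicking through emails. "
        f"For each email shown, extract these fields: {field_list}. "
        "Reply with a JSON array of objects and nothing else."
    )


def parse_json_reply(text):
    """Parse the model's reply, tolerating a Markdown fence around the JSON."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag)
        # and everything from the closing fence onwards.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    return json.loads(cleaned)


def extract_from_video(video_path, fields, model_name="gemini-1.5-flash-002"):
    """Upload a screen recording and ask Gemini to extract the given fields."""
    # Imported here so the helpers above work without the SDK installed.
    import google.generativeai as genai

    # GEMINI_API_KEY is an assumed variable name; use whatever holds your key.
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])

    video = genai.upload_file(path=video_path)
    # Uploaded videos are processed asynchronously, so poll until ready.
    while video.state.name == "PROCESSING":
        time.sleep(2)
        video = genai.get_file(video.name)
    if video.state.name != "ACTIVE":
        raise RuntimeError(f"Video processing failed: {video.state.name}")

    model = genai.GenerativeModel(model_name)
    response = model.generate_content([video, build_prompt(fields)])
    return parse_json_reply(response.text)
```

Calling `extract_from_video("recording.mov", ["date", "amount"])` would return a list of dictionaries, one per email the model found. The output still needs checking against the video by hand.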

You should never trust these things not to make mistakes, so I re-watched the 35-second video and manually checked the numbers. The model got everything right.

I had intended to use Gemini 1.5 Pro, aka Google’s best model... but it turns out I forgot to select the model and I’d actually run the entire process using the much less expensive Gemini 1.5 Flash 002.

And in fact, it was free. Google AI Studio currently “remains free of charge regardless of if you set up billing across all supported regions”. I believe that means they can train on your data though, which is not the case for their paid APIs.
