Simon Willison’s Weblog

submited by

Style Pass

2024-10-30 12:30:04

I try to publish weeknotes at least once every two weeks. It’s been four since the last entry, so I guess this one counts as monthnotes instead.

In my defense, the reason I’ve fallen behind on weeknotes is that I’ve been publishing a lot of long-form blog entries this month.

A lot of LLM stuff happened. OpenAI had their DevDay, which I used as an opportunity to try out live blogging for the first time. I figured out video scraping with Google Gemini and generally got excited about how incredibly inexpensive the Gemini models are. Anthropic launched Computer Use and JavaScript analysis, and the month ended with GitHub Universe.

My big achievement of the month was finally shipping multi-modal support for my LLM tool. This has been almost a year in the making: GPT-4 vision kicked off the new era of vision LLMs at OpenAI DevDay last November and I’ve been watching the space with keen interest ever since.

I had a couple of false starts at the feature, which was difficult at first because LLM acts as a cross-model abstraction layer, and it’s hard to design those effectively without plenty of examples of different models.