I’m starting a new interview series called Project. The idea is to interview people who are building interesting data projects and talk about what t

Simon Willison’s Weblog

submited by
Style Pass
2024-11-07 19:00:05

I’m starting a new interview series called Project. The idea is to interview people who are building interesting data projects and talk about what they’ve built, how they built it, and what they learned along the way.

The first episode is a conversation with Rajiv Sinclair from Public Data Works about VERDAD, a brand new project in collaboration with journalist Martina Guzmán that aims to track misinformation in radio broadcasts around the USA.

VERDAD hits a whole bunch of my interests at once. It’s a beautiful example of scrappy data journalism in action, and it attempts something that simply would not have been possible just a year ago by taking advantage of new LLM tools.

VERDAD tracks radio broadcasts from 48 different talk radio radio stations across the USA, primarily in Spanish. Audio from these stations is archived as MP3s, transcribed and then analyzed to identify potential examples of political misinformation.

The result is “snippets” of audio accompanied by the trancript, an English translation, categories indicating the type of misinformation that may be present and an LLM-generated explanation of why that snippet was selected.

Leave a Comment