Analyzing Spotify stream history

submitted by
Style Pass
2024-02-12 21:30:10

I recently learned Spotify provides downloads for users’ streaming history. For me, this is over 10 years’ worth of data, so at the very least it seemed like a good nostalgia trip. This post covers some of the analysis of my personal export, and is hopefully a good starting place if you’re interested in exploring your own.

The linked data is one big zip file containing many JSON-formatted events. There’s also a “ReadMeFirst” PDF with file format overviews in an impressive number of languages (“Merci pour votre patience pendant que nous collections vos données.”), but I found the JSON object keys to be sufficiently self-descriptive.

I used Jupyter, Pandas, and Matplotlib for this analysis. First, let’s read in the data and see what we’re working with.
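As a rough sketch of that first step, the snippet below reads every JSON file in the unzipped export into a single Pandas DataFrame. It assumes each file holds a JSON array of event objects. The helper name load_streaming_history, the directory argument, and the "*.json" pattern are illustrative choices, not names taken from the export, so adjust them to match your own download.

```python
import json
from pathlib import Path

import pandas as pd


def load_streaming_history(export_dir, pattern="*.json"):
    """Read every JSON file matching `pattern` in `export_dir` into one DataFrame.

    Hypothetical helper: assumes each file contains a JSON array of event objects.
    """
    frames = []
    for path in sorted(Path(export_dir).glob(pattern)):
        with open(path, encoding="utf-8") as f:
            records = json.load(f)  # assumed to be a list of dicts, one per event
        frames.append(pd.DataFrame.from_records(records))
    if not frames:
        raise FileNotFoundError(f"no files matching {pattern!r} in {export_dir}")
    # Stack the per-file frames and renumber the rows from zero.
    return pd.concat(frames, ignore_index=True)
```

In a notebook, something like df = load_streaming_history("path/to/unzipped/export") followed by df.info() gives a quick look at the columns and row count. If the zip also contains non-event JSON files, a narrower glob pattern keeps them out of the frame.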

My data contained over 120,000 events and 217 days’ worth of streaming time. The individual records contain a lot of what you’d expect: the track name, artist, album, and how long the song was played for. There are also some odd ones, like the IP address I used to connect to Spotify, but maybe we’ll look at that another day.
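Totals like these can be derived with a couple of lines. The sketch below counts events and converts total play time from milliseconds to days. The column name ms_played is an assumption about how the export labels play duration, and summarize is a hypothetical helper, so check both against your own files.

```python
import pandas as pd

MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def summarize(events, ms_column="ms_played"):
    """Return the event count and total play time in days.

    `ms_column` is assumed to hold play duration in milliseconds.
    """
    total_ms = events[ms_column].sum()
    return {"events": len(events), "days_played": total_ms / MS_PER_DAY}
```

Calling summarize(df) on the loaded DataFrame returns a small dict that is easy to print or extend with further totals.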
