Download the Current Stack Overflow Database for Free (2021-06)

Submitted by Style Pass, 2021-06-11 17:00:02

Stack Overflow, the place where most of your production code comes from, publicly exports its data every few months. @TarynPivots (their DBA) tweets about it, and then I pull some levers and import the XML data dump into SQL Server format.
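If you’re curious what the raw material looks like before the import, here’s a minimal Python sketch that streams rows out of one of the dump’s XML files. It assumes each file is a flat list of `<row>` elements with the column values stored as attributes; the `iter_rows` helper and the sample values are made up for illustration, and this is not the tool I use for the actual import.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source):
    """Yield each <row> element's attributes as a dict, one row at a time.

    `source` can be a file path or a binary file-like object.
    Assumes a flat file of <row .../> elements (an assumption about the dump layout).
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            # Copy the attributes before clearing the element.
            yield dict(elem.attrib)
            # Drop the element's contents so a huge file doesn't pile up in memory.
            elem.clear()


# Hypothetical sample data standing in for a real dump file.
sample = io.BytesIO(
    b'<badges>'
    b'<row Id="1" UserId="2" Name="Autobiographer" />'
    b'<row Id="2" UserId="3" Name="Teacher" />'
    b'</badges>'
)

rows = list(iter_rows(sample))
for row in rows:
    print(row["Id"], row["UserId"], row["Name"])
```

Every value comes back as a string, so a real loader would still need to convert IDs and dates to proper types before inserting them into SQL Server.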

Stack Overflow’s database makes for great blog post examples because it’s real-world data: real data distributions, lots of different data types, easy to understand tables, simple joins. Some of the tables include Badges, Comments, Posts, Users, and Votes.

This isn’t the exact same data structure as Stack Overflow’s current database – they’ve changed their own database over the years, but they still provide the data dump in the same style as the original site’s database, so your demo queries still work over time. If you’d like to find demo queries or find inspiration on queries to write, check out Data.StackExchange.com, a public query repository.

New this month: I built it with page-level database compression, which requires SQL Server 2016 Service Pack 1 or newer (but doesn’t require Enterprise Edition). I don’t have a before-and-after across all of the tables, but the Badges table was 2GB before, and 0.5GB afterwards. Woohoo! Every little bit helps, especially with the database size.
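If you want to try page compression on a copy of your own, here’s a rough Python sketch that builds the T-SQL rebuild statement for one table and works out the savings from the Badges numbers above. The helper names `page_compression_sql` and `percent_saved` are made up for illustration, the `dbo` schema is an assumption, and this isn’t necessarily how I built the download.

```python
def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def page_compression_sql(schema: str, table: str) -> str:
    """Build a T-SQL statement that rebuilds all of a table's indexes with page compression."""
    return (
        f"ALTER INDEX ALL ON {quote_name(schema)}.{quote_name(table)} "
        "REBUILD WITH (DATA_COMPRESSION = PAGE);"
    )


def percent_saved(before_gb: float, after_gb: float) -> float:
    """Return the size reduction as a percentage of the original size."""
    return (before_gb - after_gb) / before_gb * 100


# Assumes the table lives in the dbo schema.
print(page_compression_sql("dbo", "Badges"))
# → ALTER INDEX ALL ON [dbo].[Badges] REBUILD WITH (DATA_COMPRESSION = PAGE);

# Badges went from 2GB to 0.5GB.
print(percent_saved(2.0, 0.5))
# → 75.0
```

Run the generated statement against a restored copy rather than anything you care about: rebuilding every index on a big table takes time and log space.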
