Vertical Stacking as the Relational Model Intended: UNION ALL BY NAME

submitted by
Style Pass
2025-01-10 18:00:08

TL;DR: DuckDB allows vertical stacking of datasets by column name rather than position. This allows DuckDB to read files with schemas that evolve over time and finally aligns SQL with Codd's relational model.
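To make the difference concrete, here is a minimal sketch in plain Python of the two stacking behaviors. It is a toy model, not DuckDB's implementation: the helper names `stack_by_position` and `stack_by_name` and the sample data are hypothetical. In DuckDB itself the by-name behavior is spelled `UNION ALL BY NAME` between two `SELECT` statements. The sketch assumes that columns missing from one side are filled with nulls (`None` here) and that new columns are appended after the first table's columns.

```python
# Toy model of two stacking semantics. A "table" is a (column_names, rows) pair.
# Hypothetical helpers for illustration only; this is not how DuckDB works internally.

def stack_by_position(top, bottom):
    """Mimic plain UNION ALL: keep the top table's names, append rows untouched."""
    top_cols, top_rows = top
    _, bottom_rows = bottom
    return top_cols, top_rows + bottom_rows

def stack_by_name(top, bottom):
    """Mimic UNION ALL BY NAME: align values on column names, fill gaps with None."""
    top_cols, top_rows = top
    bottom_cols, bottom_rows = bottom
    # Assumed output order: the top table's columns, then any new ones from the bottom.
    out_cols = top_cols + [c for c in bottom_cols if c not in top_cols]

    def realign(cols, rows):
        # Look each output column up by name; absent columns become None.
        return [tuple(dict(zip(cols, row)).get(c) for c in out_cols) for row in rows]

    return out_cols, realign(top_cols, top_rows) + realign(bottom_cols, bottom_rows)

# Hypothetical data: feb swapped its column order, mar added a column.
jan = (["city", "sales"], [("Austin", 10)])
feb = (["sales", "city"], [(20, "Boston")])
mar = (["city", "sales", "region"], [("Chicago", 30, "Midwest")])

by_position = stack_by_position(jan, feb)  # "Boston" lands under "sales"
by_name = stack_by_name(jan, feb)          # values follow their column names
evolved = stack_by_name(jan, mar)          # the older rows get None for "region"
```

Stacking by position silently puts `20` under `city` and `"Boston"` under `sales`, while stacking by name keeps each value under its own column and tolerates the extra `region` column.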

Ever heard of SQL's CORRESPONDING keyword? Yeah, me neither! Well, it has been in the SQL standard since at least 1992, and almost nobody implemented it! CORRESPONDING was an attempt to fix a flaw in SQL – but it failed. It's time for SQL to get back to the relational model's roots when stacking data. Let's wind the clocks back to 1969…

You just picked up your own Ford Mustang Boss 302, drifting around the corner at every street to make it to the library to read the latest research report out of IBM by Edgar Codd. (Do we need a Netflix special about databases?) Reading that report, wearing plenty of plaid, you gain a critical insight: data should be treated as unordered sets! (Technically multisets – duplicates are everywhere…) Rows should be treated as unordered and so should columns. The relational model is the way. Any language built atop the relational model should absolutely follow those core principles.

A few years later, you learn about SQL, and it looks like a pretty cool idea. Declarative, relational – none of this maintaining order business. You don't want to be tied down by an ordering, after all. What if you change your mind about how to query your data? Sets are the best way to think about these things.
