DuckDB over Pandas/Polars

November 1, 2024 1 minute read

Recently, I wanted to analyze and visualize some financial CSVs, including joining a few files together. I started out with Polars (which I understood to be a newer/better Pandas). However, as someone who doesn’t use it frequently, I found the syntax confusing and cumbersome.

For example, here is how I parsed a Transactions.csv and summed entries by Category for rows in 2024 (simplified example, code formatted with Black):

I’m sure this is straightforward for someone who uses these tools frequently. However, that’s not me. I play around for a bit and then come back to it weeks or months later and have to relearn.

In contrast, I write SQL day in and day out, so I find it much easier. Once I switched to DuckDB, I could write much more familiar (to me) SQL, while still using Python for the rest of the code:
