Can DuckDB do everything that a million dollar database can do?

submited by
Style Pass
2025-01-03 09:30:18

KX published an article on medium (18th December 2024) demonstrating stock analysis using kdb+. Can we load the same NYSE TAQ data and perform the same analysis using a free database? How will it perform? Let's try it with DuckDB.

This was a simple query that even I was able to write in 1 minute. The great news is that the result is the exact same as that produced by kdb+.

The bad news is that it's significantly slower. 500ms is fast enough that it's not interrupting my workflow but it's not close to the 113 milliseconds reported for kdb+. I tried adjusting the CPUs used and running in-memory but performance stayed similar to initial settings or got worse.

The ttime column that I created earlier is only Time, i.e. it is NOT a timestamp or a datetime. Time seems to have less functions available than timestamp in DuckDB so rounding it to the nearest time bucket is more tricky but this did give me a chance to create a cool xbarTime function. See our time-bucketing article for more examples.

The weighted average is similar to an ordinary arithmetic mean (the most common type of average), except that instead of each of the data points contributing equally to the final average, some data points contribute more than others. In finance it is common to weight trades by the number of trades executed at a price point, to find a weighted average of all prices.

Leave a Comment