How fast is Iceberg on Snowflake?

Submitted by
Style Pass
2024-10-31 14:00:04

Iceberg is an open-source table format that has been gaining support among data platforms. Reading from Iceberg is now at least partially supported by all of the major data warehouses, and new usability features and performance improvements are shipping every quarter.

Iceberg and other open table formats promise benefits such as multi-engine querying, data sharing, and simplified data pipelines. But querying Parquet files on S3 was possible long before Iceberg, so is Iceberg actually better?

We wanted to explore exactly how Iceberg storage compares to external storage when queried from one of the leading data warehouses, so we ran the test. Here are the results:

We used DuckDB to generate a 100 GB TPC-DS dataset (scale factor: 100). This dataset was then mounted in or loaded into Snowflake in four configurations, with no sorting or clustering.
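
The data generation step can be sketched roughly as follows. This is a minimal illustration, not the exact script used for the test: it assumes DuckDB's `tpcds` extension (`dsdgen`) and a Parquet export via `EXPORT DATABASE`, and the function names, output directory, and database file name are hypothetical.

```python
def tpcds_export_statements(scale_factor: int, out_dir: str) -> list[str]:
    """Build the DuckDB SQL that generates TPC-DS data and exports it as Parquet.

    Assumption: the Parquet files are produced with EXPORT DATABASE; the
    original post does not say how the files were written.
    """
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return [
        "INSTALL tpcds",                      # fetch the TPC-DS extension
        "LOAD tpcds",                         # make dsdgen() available
        f"CALL dsdgen(sf = {scale_factor})",  # sf = 100 is roughly 100 GB of raw data
        f"EXPORT DATABASE '{out_dir}' (FORMAT PARQUET)",
    ]


def generate(scale_factor: int, out_dir: str, db_path: str = "tpcds.duckdb") -> None:
    """Run the statements against an on-disk DuckDB database (hypothetical file name)."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect(db_path)
    try:
        for stmt in tpcds_export_statements(scale_factor, out_dir):
            con.execute(stmt)
    finally:
        con.close()
```

Calling `generate(100, "tpcds_sf100_parquet")` would write one Parquet file per TPC-DS table into the target directory, ready to be uploaded to S3. An on-disk database is used here because a scale factor of 100 is unlikely to fit comfortably in memory on a typical machine.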

We created a new Snowflake warehouse (size: small) and ran the 99 queries in the TPC-DS benchmark against the four datasets. We ensured all data and querying were colocated in us-east-1, and we disabled all caching.
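
A benchmark run like this can be sketched as a small timing harness. The snippet below is an assumption-laden illustration rather than the harness actually used: the warehouse name `BENCH_WH` and the function `time_queries` are hypothetical, and `USE_CACHED_RESULT = FALSE` only turns off Snowflake's result cache (the post does not say how the warehouse's local data cache was handled).

```python
import time
from typing import Callable, Iterable

# Hypothetical setup statements; BENCH_WH is an illustrative warehouse name.
SESSION_SETUP = [
    "CREATE WAREHOUSE IF NOT EXISTS BENCH_WH WAREHOUSE_SIZE = 'SMALL'",
    "USE WAREHOUSE BENCH_WH",
    # Disables reuse of previously computed query results for this session.
    "ALTER SESSION SET USE_CACHED_RESULT = FALSE",
]


def time_queries(
    execute: Callable[[str], object],
    queries: Iterable[tuple[int, str]],
) -> dict[int, float]:
    """Run each (query_number, sql) pair once and record wall-clock seconds.

    `execute` is any callable that submits SQL and blocks until it finishes,
    so the harness stays independent of a particular client library.
    """
    timings: dict[int, float] = {}
    for number, sql in queries:
        start = time.perf_counter()
        execute(sql)
        timings[number] = time.perf_counter() - start
    return timings
```

With the Snowflake Python connector, `execute` could be something like `lambda sql: cursor.execute(sql).fetchall()`, called first for each statement in `SESSION_SETUP` and then passed to `time_queries` once per dataset. Timings measured this way are client-side and include network round trips, so they are best compared relative to each other.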
