6 Comments
Reeves Smith

Great benchmarks here. The real lesson we see repeatedly: performance comes less from the choice of platform and more from how thoughtfully data is structured and queried.

For most organizations, modern platforms already deliver far more performance than 99% of workloads actually require; the real gap is usually design, not technology.

Eu

Great article!

Han

Thanks for the clear writing and sample code.

I have some questions I hope you can answer about experiment consistency across environments, and about whether these variances are important to fix.

Some differences:

1. The local method used Parquet files while the MotherDuck test used DuckDB databases.

I'm not sure whether it's faster to select from read_parquet('{wildcard file_path}') locally or from {db_name}.main.{table_name} in MotherDuck, or what the computer science principles behind the answer are.

2. Why not use DuckDB databases locally too? I assume that would make the test fairer. Or is that slower than Parquet files when run locally, and the aim of the experiment is to find the fastest method in each environment, even if the methods are inconsistent?

3. Does the decision to save to files locally have something to do with local DuckDB being unable to write concurrently to a database with ProcessPoolExecutor?
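
Question 3 can be illustrated with a small sketch. As far as I know, DuckDB lets only one process at a time open a database file for writing, so a common workaround is to give each worker its own output file and query the files together afterwards. The code below is not the article's code: `write_chunk`, `write_all`, and the file naming are hypothetical, and it writes CSV from the standard library as a stand-in for Parquet so that it runs without extra packages.

```python
import csv
import tempfile
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path
from typing import List, Optional, Tuple


def write_chunk(task: Tuple[str, int, int]) -> str:
    """Write one worker's rows to its own file and return the file path."""
    out_dir, chunk_id, n_rows = task
    # One file per chunk: no two workers ever open the same file.
    path = Path(out_dir) / f"chunk_{chunk_id:04d}.csv"
    with path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["chunk_id", "row_id"])
        for row_id in range(n_rows):
            writer.writerow([chunk_id, row_id])
    return str(path)


def write_all(n_chunks: int, n_rows: int,
              out_dir: Optional[str] = None, workers: int = 4) -> List[str]:
    """Fan the chunks out across processes (assumed layout, not the author's)."""
    if out_dir is None:
        out_dir = tempfile.mkdtemp()  # scratch directory for the demo
    tasks = [(out_dir, i, n_rows) for i in range(n_chunks)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(write_chunk, tasks))
```

Called from a script's `if __name__ == "__main__":` block, `write_all(8, 1000)` would return eight paths that a wildcard read could then scan as one table. Whether this is why the article used files locally is something only the author can confirm.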

Other questions unrelated to experiment consistency:

1. Why cast(uuid() as varchar(30))? If I understand correctly, uuid() is 36 characters. Doesn't truncating it risk non-uniqueness, which defeats its purpose?

2. Why is cold start not accounted for in the local test?

3. In the unsorted MotherDuck test, why seed with a view instead of a table? The view is no longer used once the table-doubling iterations begin.
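
On question 1 in this list, a rough sketch of how much randomness would survive if a version-4 UUID string really were cut to 30 characters. This assumes a random (v4) UUID in the usual 8-4-4-4-12 text layout; `random_bits_kept` and `collision_probability` are hypothetical helper names, and the probability is the standard birthday approximation, not a measurement. Separately, my understanding is that DuckDB accepts the length in varchar(n) but does not enforce it, so the cast may not truncate at all; that is worth checking against the DuckDB version in use.

```python
import math
import uuid


def random_bits_kept(prefix_len: int) -> int:
    """Count the random bits in the first prefix_len chars of a UUIDv4 string."""
    text = str(uuid.uuid4())  # 36 chars: 32 hex digits plus 4 hyphens
    bits = 0
    for i, ch in enumerate(text[:prefix_len]):
        if ch == "-":
            continue          # hyphens carry no information
        if i == 14:
            continue          # version nibble, always '4'
        if i == 19:
            bits += 2         # variant nibble: top two bits are fixed
            continue
        bits += 4             # ordinary hex digit
    return bits


def collision_probability(n_rows: int, bits: int) -> float:
    """Birthday approximation: chance of any duplicate among n_rows values."""
    exponent = n_rows * (n_rows - 1) / 2 ** (bits + 1)
    return -math.expm1(-exponent)


full_bits = random_bits_kept(36)
cut_bits = random_bits_kept(30)
p_cut = collision_probability(10**10, cut_bits)
```

Under these assumptions, truncation lowers the random bit count without making duplicates likely at benchmark scale: `p_cut` is the estimated chance of any collision across ten billion rows with the shortened value.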

Slawomir Piotr

MotherDuck reminds me of Snowflake.

Fritts

How would this get exposed to a BI tool? JDBC within Power BI? I’m curious about the cost comparison of running a DirectQuery Power BI model against DuckDB versus a Databricks 80 DBU cluster running a 10-billion-row fact table 80 columns wide. I think I’m going to experiment.

Swati Popuri

The duck walked up to the lemonade stand and said to the man, "Hey! Do you have any grapes?"
