DataExpert.io Newsletter

Processing 1 TB with DuckDB in less than 30 seconds

And so can you

Matt Martin and Zach Wilson
Dec 23, 2025

Get ready to toss out all the norms and conventional wisdom about distributed compute! Today, we are eradicating the belief that DuckDB can only be used for “small” data.

In this article, we will attack the following beliefs:

  • Only Spark can handle terabytes of data (or Spark is ALWAYS the best choice)

  • You need a lot of time to process TBs of data

We want…
