At midnight on September 16th, Jason Reid (data engineering advocate at Databricks) messaged me on LinkedIn: “Tabular will be sunsetted in 24 hours. I hope you have migrated.” My lazy ass had not.
Panic immediately set in. I had 13,000 tables, 2,200 schemas, and 3 terabytes of data managed by Tabular, all generated by my students over the last two years, and I needed to move it.
I had built on top of Tabular for the last two years, teaching thousands of students and making over $3 million. Saying goodbye to Tabular was like saying goodbye to a best friend.
I used AI to knock out this migration, which felt insurmountable, in less than half a day of work!
How I came up with a last-minute migration path
After I realized how little time I had, my first job was to extract any information the Tabular REST API might be storing that isn’t in S3. That would let me finish the migration at my own pace after the sunset.
I looked at some of the file paths like s3://<s3 bucket>/ce557692-2f28-41e8-8250-8608042d2acb/000bde54-fb46-4e1a-a41d-962ea620ea8e/metadata/00000-37890644-fb40-4392-a395-9556ef865858.gz.metadata.json
I realized that Iceberg doesn’t store the table name in the file path or in the metadata files (which makes sense: table names are mutable, and Iceberg supports schema evolution).
So I was going to need to play around with the Tabular API to map table names to Iceberg metadata files.
First I made a REST call to https://api.tabular.io/ws/v1/ice/warehouses/<warehouse>/namespaces/
This made me realize that I had 2,200 namespaces (one for each student who had created a new schema)!
Then for each namespace, I hit this endpoint
https://api.tabular.io/ws/v1/ice/warehouses/<warehouse>/namespaces/<namespace>/tables
This gave me a list of tables in that namespace.
Then for each table, I hit this endpoint
https://api.tabular.io/ws/v1/ice/warehouses/<warehouse>/namespaces/<namespace>/tables/<table>
This final endpoint gave me the table’s S3 location, its column data types, and more. Then I built a hash map from table name to metadata.
I then dumped this to a giant config.json file. (The file was 73 MB!)
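If you want to see the shape of that crawl, here’s a minimal Python sketch of the three nested calls. The endpoint paths are the real ones above; the auth header and response field names are my assumptions, based on the Iceberg REST catalog spec that Tabular’s API appears to follow.

```python
import json
import requests

BASE = "https://api.tabular.io/ws/v1/ice/warehouses/<warehouse>"
# Assumption: a bearer token works for auth; swap in whatever credential you use.
HEADERS = {"Authorization": "Bearer <token>"}

def get(path):
    resp = requests.get(f"{BASE}{path}", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

config = {}
# 1. List every namespace (one per student schema).
for ns in get("/namespaces")["namespaces"]:
    namespace = ".".join(ns)  # the REST spec returns namespaces as arrays of parts
    # 2. List every table in that namespace.
    for ident in get(f"/namespaces/{namespace}/tables")["identifiers"]:
        table = ident["name"]
        # 3. Load the table: this response carries the metadata location, schema, etc.
        detail = get(f"/namespaces/{namespace}/tables/{table}")
        config[f"{namespace}.{table}"] = {
            "metadata_location": detail["metadata-location"],
            "metadata": detail.get("metadata", {}),
        }

# Dump the whole mapping so nothing depends on Tabular after the sunset.
with open("config.json", "w") as f:
    json.dump(config, f)
```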
Once I saw that file, I breathed a sigh of relief: I no longer depended on the Tabular API, so they could sunset it and I wouldn’t lose any data!
The next step was deciding which catalog to move to.
Deciding which catalog to choose
Databricks bought Tabular for over a billion dollars last summer, so the obvious first choice was moving to Unity Catalog.
But I ran into some challenges with Unity Catalog, most notably that Starburst doesn’t currently support it. I use Starburst to teach Trino and Iceberg, and it’s part of my web query page.
So then the choices seemed to be between:
Polaris Catalog
Glue Catalog
I just wanted to finish the migration quickly, so I didn’t really research Polaris. I’m already very familiar with the Glue Catalog, having taught it many times in my boot camp, so it seemed like the more natural fit.
I begged Claude Code to give me a bash script that would walk the giant config.json file and register all the tables into the Glue Catalog. Here’s the script if you’re curious.
I ran that piece of shit and it worked better than I thought. It took about 20 minutes to re-register all the tables into Glue (without moving a single data file!).
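In case you’re wondering what “registering” actually means here: an Iceberg table in Glue is essentially a Glue table whose parameters point at the latest metadata.json in S3. My actual script was bash, but a minimal boto3 sketch of the same idea looks roughly like this (the config.json shape is the one from my crawl above):

```python
import json
import boto3

glue = boto3.client("glue", region_name="us-west-2")

with open("config.json") as f:
    config = json.load(f)

for full_name, info in config.items():
    database, table = full_name.split(".", 1)
    # The database may not exist yet; ignore the error if it already does.
    try:
        glue.create_database(DatabaseInput={"Name": database})
    except glue.exceptions.AlreadyExistsException:
        pass
    # Registering an Iceberg table is just pointing Glue at the metadata file.
    glue.create_table(
        DatabaseName=database,
        TableInput={
            "Name": table,
            "TableType": "EXTERNAL_TABLE",
            "Parameters": {
                "table_type": "ICEBERG",
                "metadata_location": info["metadata_location"],
            },
        },
    )
```

No data files move at all, which is why the whole thing only took 20 minutes.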
Finishing up the migration
After all the data was in Glue and I was able to query it in Starburst, I breathed a huge sigh of relief. But my application queried the Tabular REST API directly so students could see all their tables, so I had to migrate that over to Glue too.
Claude Code saved my bacon here again. I asked it, “migrate all the Tabular endpoints to use AWS Glue catalog in us-west-2”
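The translation is pretty mechanical, since every Tabular endpoint I was using has a Glue equivalent. Something like this (the function names are mine, purely for illustration):

```python
import boto3

glue = boto3.client("glue", region_name="us-west-2")

def list_namespaces():
    # Was: GET /namespaces on the Tabular REST API.
    pages = glue.get_paginator("get_databases").paginate()
    return [db["Name"] for page in pages for db in page["DatabaseList"]]

def list_tables(namespace):
    # Was: GET /namespaces/<namespace>/tables
    pages = glue.get_paginator("get_tables").paginate(DatabaseName=namespace)
    return [t["Name"] for page in pages for t in page["TableList"]]

def get_table(namespace, table):
    # Was: GET /namespaces/<namespace>/tables/<table>
    # The Iceberg metadata location lives in the Glue table's Parameters.
    t = glue.get_table(DatabaseName=namespace, Name=table)["Table"]
    return t["Parameters"].get("metadata_location")
```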
It one-shotted this full-stack migration. We are so cooked, but I’m glad that, as a single human, I was able to do this in 4 hours without disrupting my students’ learning experience for even a single second!
AI is the future! Let’s keep building together! If you want to learn how to use AI to supercharge your data engineering career like this, you can join my AI boot camp that starts on October 20th. You can use code IACCEPT for 30% off!