Jason Bell
Feb 17, 2022

--

A few observations:

Most "startup" data problems are simple enough to solve in SQL with a tried-and-tested database such as MySQL or Postgres.
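
As a rough illustration of that point, here is a minimal sketch of a transform-and-load step done entirely in SQL. Everything in it is assumed for the example: the `raw_orders` and `customer_revenue` tables, their columns, and the sample rows are hypothetical, and SQLite (via Python's built-in `sqlite3`) stands in for MySQL or Postgres only so that it runs without a server.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL/Postgres so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Hypothetical raw table, as it might arrive from an application database.
    CREATE TABLE raw_orders (
        order_id     INTEGER,
        customer     TEXT,
        amount_cents INTEGER,
        status       TEXT
    );
    INSERT INTO raw_orders VALUES
        (1, 'acme',   1250, 'paid'),
        (2, 'acme',   4000, 'refunded'),
        (3, 'globex', 9900, 'paid'),
        (4, 'acme',    750, 'paid');

    -- Transform and load in one statement: filter, aggregate, write the result.
    CREATE TABLE customer_revenue AS
    SELECT customer,
           COUNT(*)          AS paid_orders,
           SUM(amount_cents) AS revenue_cents
    FROM raw_orders
    WHERE status = 'paid'
    GROUP BY customer;
""")

rows = conn.execute(
    "SELECT customer, paid_orders, revenue_cents "
    "FROM customer_revenue ORDER BY customer"
).fetchall()
print(rows)  # [('acme', 2, 2000), ('globex', 1, 9900)]
```

The same `CREATE TABLE ... AS SELECT` pattern should carry over to MySQL or Postgres with little change, which is the point: no cluster is needed for a job of this size.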

To start with, "Modern ETL" pipelines are expensive to run, and once you move to distributed nodes they get very expensive. So the real question to ask is: what is the actual problem you are trying to solve?

Also, there is no mention of Kafka. Streaming data ETL is far more in demand at the moment, with the likes of Kafka Connect and ksqlDB handling source/sink data, transforms, and aggregation.


Written by Jason Bell

The Startup Quant and founder of ATXGV: Author of two machine learning books for Wiley Inc.
