5 Comments

I am obviously biased, but the idea of having hundreds of data platform engineers all building the same data platforms / data frameworks seems incredibly wasteful, although it's definitely interesting work.

The ELT vendors saw an opportunity when they realised there were thousands of developers all writing exactly the same code to move data from Salesforce to a data warehouse. It is preferable to have one person write that code really well and have other people use it (be it via a vendor or something open source), and I would argue the same applies to frameworks (there will always be outliers).

This! It feels like we are getting close to the tipping point. I wonder if the final push it needs is going all in on data products as the raison d'être and data owners as the foundation. It's not about building tubes but about having a system that allows domain experts to make their own assets/sets/quantums/thingies. E & T & L will continue to happen, but that's an implementation detail, not the reason.

Also, "were able to move on to more important work, like creating flashcards, founding startups, and getting into fights on Twitter." ❤️

First, I really enjoyed reading through the history of how Data Engineering has evolved. Having lived this journey, it felt nostalgic to walk down that lane again! Second, since we are software engineers at our core, the urge to build "a product/framework" that many others can use never dies!

The need for data engineering has always been motivated by the fact that transactional data is spread far and wide (many systems), often inconsistent, volatile and optimized for operational concerns, while data for analytical purposes is required to be consistent, stable and optimized for analytical concerns (aggregation and statistics). As this is unlikely to change, data engineering will not go away. We might call it something else, yet the core tasks - extracting data, transforming it and making it available - are bound to stay relevant. The dream that domain owners, aka "the business", will write their own pipelines is as old as the field. Many attempts have been made; they always turn out to be too technical and thus end up delegated to technically skilled personnel.

Great post, particularly astute commentary: "Data Engineers are coming back to the original sin of Data Engineering, building bespoke custom pipelines for your downstream consumers, and they’re solving it the same way we were trying to solve it 10 years ago: building platforms, frameworks, and services... The next evolution of the role is more akin to a Data Platform Engineer. This is someone who is tasked not with building ETL pipelines, but with making it possible for their various consumers to build any pipeline they need without having to resort to a complex higher language."