Extract-Load-Transform (ELT) is the core primitive of the modern data stack. ELT has only recently become practical, thanks to the availability of cheap object storage such as Amazon S3. Unfortunately, ELT is a fundamentally suboptimal primitive for the majority of data use cases, which makes achieving positive data outcomes an uphill battle for many data organizations.
I believe the answer is to treat data as a software problem, in the same way that Google treats operations as a software problem: provisioning data systems instead of solving requests on behalf of individual stakeholders. That shift would usher in an era of software-based data analytics in place of data offerings that are fundamentally human consulting. Data needs to be a software problem, not a human consulting problem.
We can look to software engineering for ways to generate high-fidelity business insights, bypassing layer upon layer of batch-based data engineering modeling and aligning insight ownership with the producing teams. To succeed and scale, the industry needs to modernize with a focus on engineering and remove human data consultants from the loop. Sustainable business insights don't come from relying on other teams; they happen when data producers are empowered to be data consumers: to create their own data, query it, generate insights, and make data-driven product decisions independent of other teams.

Extract-Load-Transform and the Data Lake