
Data Engineering Whitepapers

submitted by
Style Pass
2024-10-14 08:00:06
A curated list of influential whitepapers in the field of data engineering.

- [[Data Lakehouse]]: Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics ^6a7f75
- [[Data Catalog]]: Ground: A Data Context Service
- [[Apache Spark]]: Spark: Cluster Computing with Working Sets
- [[Data Engineering Architecture]]: The Google File System
- [[Streaming]]: The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing
- [[Google File System (GFS)]]: The Google File System ^fdbf43
- [[MapReduce]]: MapReduce: Simplified Data Processing on Large Clusters
- [[Data Warehouse|Data Warehousing]]: Dremel: Interactive Analysis of Web-Scale Datasets
- [[Data Mesh]]: How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh
- [[DuckDB]]: MotherDuck: DuckDB in the cloud and in the client
  - A paper that introduces the [[1-5-Tier Architecture]].