Data Engineering in Nutshell

Engineering at Trillion-Row Scale: A Deep Dive into Uber’s Hudi-Powered Data Lake

April 5, 2026 by Karthikeyan S

In the world of data engineering, “scale” is often a relative term. But at Uber, scale means managing a multi-hundred-petabyte repository that ingests 6 trillion rows daily. To manage this tidal wave of information, Uber moved away from traditional append-only data lakes and created Apache Hudi™, a storage engine that brings database-like primitives to …

Python Topics for Data Engineers: Essential Skills You Must Learn

March 5, 2026 by Karthikeyan S

Python has become one of the most important programming languages in modern data engineering. From building data pipelines to processing large datasets and integrating APIs, Python plays a critical role in the daily workflow of a data engineer. If you’re planning to build a career in data engineering, understanding the essential Python topics for data …