Apache Spark Transformations and Actions — Lazy Evaluation — PySpark

Christopher Chung
Data Engineering Lab
4 min read · Feb 3, 2024

Apache Spark is a distributed processing framework for working with very large datasets. Using it effectively depends on understanding the distinction between transformations and actions. In this guide, we’ll explain both concepts, work through code examples, and guide you towards…
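
To make the distinction concrete, here is a minimal sketch of lazy evaluation in plain Python. It is an analogy only, not the PySpark API: `LazyDataset`, `double`, and `calls` are hypothetical names invented for this illustration. Transformations only record a step in a plan, and nothing runs until an action is called.

```python
# Plain-Python analogy of lazy evaluation (NOT the PySpark API).
# LazyDataset is a hypothetical class invented for this illustration.

class LazyDataset:
    def __init__(self, data, plan=()):
        self._data = list(data)
        self._plan = tuple(plan)  # recorded steps, not yet executed

    # "Transformations": record a step and return a new dataset. No work is done.
    def map(self, fn):
        return LazyDataset(self._data, self._plan + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._data, self._plan + (("filter", pred),))

    # "Actions": execute the recorded plan and return a concrete result.
    def collect(self):
        rows = iter(self._data)
        for kind, fn in self._plan:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)

    def count(self):
        return len(self.collect())


calls = []  # records every time double() actually runs

def double(x):
    calls.append(x)
    return x * 2

# Chaining transformations builds a plan but does not call double() yet.
ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)

# The action triggers the whole plan.
result = ds.collect()
```

In PySpark the same split applies: `map` and `filter` on an RDD are transformations, while `collect` and `count` are actions that trigger the computation.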


Data Engineering | Management | Governance | Strategy | Leadership | Culture https://topmate.io/chris_chung