Introduction to Data Engineering
Vincent Vankrunkelsven
Data Engineer @ DataCamp



SELECT year, AVG(age)
FROM views.athlete_events
GROUP BY year


Transformations: .map() or .filter()
Actions: .count() or .first()
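In Spark, methods such as .map() and .filter() are lazy: they only describe work to be done. Methods such as .count() and .first() are what trigger the computation. The sketch below is only an analogy in plain Python, not Spark code. It relies on Python's built-in `map` and `filter` also being lazy; the `double` helper and the `calls` list are hypothetical names introduced to make the evaluation order visible.

```python
# Plain-Python analogy for lazy evaluation (NOT Spark code).
calls = []  # records every value that actually gets processed


def double(x):
    """Hypothetical helper: doubles x and logs that it was evaluated."""
    calls.append(x)
    return x * 2


numbers = [1, 2, 3, 4, 5]

# "Transformations": build the pipeline; nothing is computed yet.
pipeline = filter(lambda value: value > 4, map(double, numbers))
calls_before_action = len(calls)

# "Action": asking for the first result forces just enough work.
first_value = next(pipeline)
calls_after_action = len(calls)

print(calls_before_action, first_value, calls_after_action)
```

Only the elements needed to produce the first result are processed, which is the same idea that lets Spark postpone and optimize work until an action is called.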
# Load the dataset into athlete_events_spark first
(athlete_events_spark
    .groupBy('Year')
    .mean('Age')
    .show())
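To make concrete what the SQL query and the Spark snippet both compute, here is a minimal plain-Python sketch of a group-by average. The rows are made-up values standing in for the athlete_events table, and `average_age_by_year` is a hypothetical helper, not part of Spark or any library.

```python
from collections import defaultdict

# Hypothetical rows standing in for athlete_events (made-up values).
rows = [
    {"year": 1992, "age": 24},
    {"year": 1992, "age": 28},
    {"year": 2012, "age": 30},
    {"year": 2012, "age": 21},
    {"year": 2012, "age": None},  # missing age
]


def average_age_by_year(rows):
    """Group rows by year and average the non-missing ages."""
    totals = defaultdict(lambda: [0, 0])  # year -> [sum of ages, count]
    for row in rows:
        if row["age"] is None:
            continue  # skip missing values, as SQL AVG skips NULLs
        totals[row["year"]][0] += row["age"]
        totals[row["year"]][1] += 1
    return {year: total / count for year, (total, count) in totals.items()}


result = average_age_by_year(rows)
print(result)
```

Spark performs the same grouping and averaging, but distributes the rows across a cluster, so no single machine has to hold the whole dataset.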