Introduction to Data Engineering
Vincent Vankrunkelsven
Data Engineer @ DataCamp



SELECT year, AVG(age)
FROM views.athlete_events
GROUP BY year
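
The query above can be tried out locally. The following is a minimal sketch, not the course's setup: it assumes an in-memory SQLite database, a table named `athlete_events` without the `views.` schema prefix, and a handful of made-up rows that are not taken from the real dataset.

```python
import sqlite3

# Hypothetical sample rows (year, age); illustrative only, not real data.
rows = [(2000, 20), (2000, 30), (2004, 22), (2004, 24), (2004, 26)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE athlete_events (year INTEGER, age INTEGER)")
conn.executemany("INSERT INTO athlete_events VALUES (?, ?)", rows)

# Same aggregation as the slide's query, minus the `views.` schema prefix.
# ORDER BY is added only to make the row order deterministic.
avg_age_by_year = conn.execute(
    "SELECT year, AVG(age) FROM athlete_events GROUP BY year ORDER BY year"
).fetchall()
conn.close()

print(avg_age_by_year)
```

Each returned tuple pairs a year with the mean age of the rows recorded for that year.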


.map() or .filter().count() or .first()
# Load the dataset into athlete_events_spark first
(athlete_events_spark
    .groupBy('Year')   # group the rows by year
    .mean('Age')       # average age within each group
    .show())           # print the resulting rows