Cleaning Data with PySpark
Mike Metzger
Data Engineering Consultant
Conditional clauses are:
.when()
.otherwise()
.when(<if condition>, <then x>)
df.select(df.Name, df.Age, F.when(df.Age >= 18, "Adult"))
| Name | Age | |
|---|---|---|
| Alice | 14 | |
| Bob | 18 | Adult |
| Candice | 38 | Adult |
Multiple .when()
df.select(df.Name, df.Age,
          F.when(df.Age >= 18, "Adult")
           .when(df.Age < 18, "Minor"))
| Name | Age | |
|---|---|---|
| Alice | 14 | Minor |
| Bob | 18 | Adult |
| Candice | 38 | Adult |
.otherwise() is like else
df.select(df.Name, df.Age,
          F.when(df.Age >= 18, "Adult")
           .otherwise("Minor"))
| Name | Age | |
|---|---|---|
| Alice | 14 | Minor |
| Bob | 18 | Adult |
| Candice | 38 | Adult |