Parallel Programming with Dask in Python
James Fulton
Climate Informatics Researcher
import xarray as xr
ds = xr.open_zarr("data/era_eu.zarr")
print(ds)
<xarray.Dataset>
Dimensions: (lat: 30, lon: 45, time: 504)
Coordinates:
* lat (lat) float64 35.5 36.5 37.5 38.5 39.5 ... 60.5 61.5 62.5 63.5 64.5
* lon (lon) float64 -14.5 -13.5 -12.5 -11.5 -10.5 ... 26.5 27.5 28.5 29.5
* time (time) datetime64[ns] 1979-05-31 1979-06-30 ... 2021-04-30
Data variables:
precip (time, lat, lon) float32 dask.array<chunksize=(12, 15, 15), ... >
temp (time, lat, lon) float32 dask.array<chunksize=(12, 15, 15), ... >
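As a rough illustration of what the chunksize in this output implies, the sketch below counts the chunks behind each variable. It assumes every chunk has the full (12, 15, 15) shape, which the dimension sizes allow since each divides evenly; the variable names are made up for this example.

```python
import math

# Dimension sizes and chunk sizes copied from the printed Dataset above
dims = {"time": 504, "lat": 30, "lon": 45}
chunksize = {"time": 12, "lat": 15, "lon": 15}

# Assumption: chunking is uniform along every dimension
chunks_per_dim = {name: math.ceil(dims[name] / chunksize[name]) for name in dims}
n_chunks = math.prod(chunks_per_dim.values())

# float32 values take 4 bytes each
bytes_per_chunk = math.prod(chunksize.values()) * 4

print(chunks_per_dim)   # {'time': 42, 'lat': 2, 'lon': 3}
print(n_chunks)         # 252
print(bytes_per_chunk)  # 10800
```

Under that assumption, each variable is split into 252 small chunks that Dask can process independently.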
# Select a particular date
df.loc['2020-01-01']
# Select by index number
df.iloc[0]
# Select column
df['column1']
# Select a particular date
ds.sel(time='2020-01-01')
# Select by index number
ds.isel(time=0)
# Select variable
ds['variable1']
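The three xarray selections above can be tried on a small synthetic Dataset. This is a minimal sketch: `ds_demo`, its values, and its coordinates are invented for illustration and are not the ERA data from the earlier slide.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy dataset: 3 daily time steps on a 2 x 2 grid
times = pd.date_range("2020-01-01", periods=3, freq="D")
values = np.arange(12, dtype="float64").reshape(3, 2, 2)
ds_demo = xr.Dataset(
    {"variable1": (("time", "lat", "lon"), values)},
    coords={"time": times, "lat": [35.5, 36.5], "lon": [-14.5, -13.5]},
)

# Label-based selection, the counterpart of df.loc
by_label = ds_demo.sel(time="2020-01-01")

# Position-based selection, the counterpart of df.iloc
by_position = ds_demo.isel(time=0)

# Selecting a variable returns a DataArray, much like selecting a column
one_variable = ds_demo["variable1"]
```

Here `by_label` and `by_position` pick out the same first time step, once by date and once by integer position.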
# Perform mathematical operations
df.mean()
# Groupby and mean
df.groupby(df['time'].dt.year).mean()
# Rolling mean
rolling_mean = df.rolling(5).mean()
# Perform mathematical operations
ds.mean()
ds.mean(dim='dim1')
ds.mean(dim=('dim1', 'dim2'))
# Groupby and mean
ds.groupby(ds['time'].dt.year).mean()
# Rolling mean
rolling_mean = ds.rolling(dim1=5).mean()
rolling_mean.compute()
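The rolling mean followed by `.compute()` can be sketched end to end on a small Dask-backed Dataset. This assumes Dask is installed; `ds_demo`, the chunk size of 10, and the data values are hypothetical choices for this example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 1-D dataset with 20 daily values: 0.0, 1.0, ..., 19.0
times = pd.date_range("2020-01-01", periods=20, freq="D")
ds_demo = xr.Dataset(
    {"variable": (("time",), np.arange(20, dtype="float64"))},
    coords={"time": times},
)

# Split the data into Dask chunks of 10 time steps (requires dask)
ds_demo = ds_demo.chunk({"time": 10})

# Build the rolling mean; with Dask-backed data this is not evaluated yet
rolling_lazy = ds_demo.rolling(time=5).mean()

# Run the computation and load the result into memory as NumPy arrays
result = rolling_lazy.compute()

print(result["variable"].values[:6])
```

With the default settings, the first four values are NaN because a full 5-step window is not yet available; the fifth value is the mean of 0 to 4.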
ds['variable'].plot()
Example