Working with raster data

Working with Geospatial Data in Python

Joris Van den Bossche

Open source software developer and teacher, GeoPandas maintainer

Grid of values illustration

Image source: QGIS documentation

Working with Geospatial Data in Python

Raster data

Chance of rain map

Working with Geospatial Data in Python

Raster data with multiple bands

Working with Geospatial Data in Python

The rasterio package

import rasterio
  • "Pythonic" bindings to GDAL
  • Reading and writing raster files
  • Processing tools (masking, reprojection, resampling, ..)

https://rasterio.readthedocs.io/en/latest/

Working with Geospatial Data in Python

Opening a raster file

import rasterio

src = rasterio.open("DEM_world.tif")

Metadata:

src.count
1
src.width, src.height
(4320, 2160)
Working with Geospatial Data in Python

Raster data = numpy array

array = src.read()

Standard numpy array:

array
array([[[-4290, -4290, -4290, ..., -4290, -4290, -4290],
        [-4278, -4278, -4278, ..., -4278, -4278, -4278],
        [-4269, -4269, -4269, ..., -4269, -4269, -4269],
        ...,
        [ 2804,  2804,  2804, ...,  2804,  2804,  2804],
        [ 2804,  2804,  2804, ...,  2804,  2804,  2804],
        [ 2804,  2804,  2804, ...,  2804,  2804,  2804]]], dtype=int16)
Working with Geospatial Data in Python

Plotting a raster dataset

Using the rasterio.plot.show() method:

import rasterio.plot

rasterio.plot.show(src, cmap='terrain')

Working with Geospatial Data in Python

Extracting information based on vector data

rasterstats: Summary statistics of geospatial raster datasets based on vector geometries (https://github.com/perrygeo/python-rasterstats)

Working with Geospatial Data in Python

Extract raster values with rasterstats

  • For point vectors:

    rasterstats.point_query(geometries, "path/to/raster", 
                            interpolation='nearest'|'bilinear')
    
  • For polygon vectors:

    rasterstats.zonal_stats(geometries, "path/to/raster",
                            stats=['min', 'mean', 'max'])
    
Working with Geospatial Data in Python

Extract raster values with rasterstats

result = rasterstats.zonal_stats(countries.geometry, "DEM_gworld.tif", 
                                 stats=['mean'])

countries['mean_elevation'] = pd.DataFrame(result)
countries.sort_values('mean_elevation', ascending=False).head()
            name    continent                     geometry  mean_elevation
157   Tajikistan         Asia  POLYGON ((74.98 37.41, ...      3103.231105
85    Kyrgyzstan         Asia  POLYGON ((80.25 42.34, ...      2867.717142
24        Bhutan         Asia  POLYGON ((91.69 27.77, ...      2573.559846
119        Nepal         Asia  POLYGON ((81.11 30.18, ...      2408.907816
6     Antarctica   Antarctica  (POLYGON ((-59.57 -80.04...     2374.075028
..           ...          ...                          ...             ...
Working with Geospatial Data in Python

Let's practice!

Working with Geospatial Data in Python

Preparing Video For Download...