Introduction to GeoPandas

Working with Geospatial Data in Python

Joris Van den Bossche

Open source software developer and teacher, GeoPandas maintainer

Spatial-specific data formats

import pandas as pd
restaurants = pd.read_csv("datasets/paris_restaurants.csv")
restaurants.head()
                                             type         x          y
0                             Restaurant européen  259641.6  6251867.4
1                Restaurant traditionnel français  259572.3  6252030.2
2                Restaurant traditionnel français  259657.2  6252143.8
3  Restaurant indien, pakistanais et Moyen Orient  259684.4  6252203.6
4                Restaurant traditionnel français  259597.9  6252230.0

In the rest of the course:

  • spatial file formats (Shapefiles, GeoJSON, GeoPackage, ...)
  • GeoPandas: pandas dataframes with support for spatial data
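
The restaurants table above stores locations as plain x and y columns. As a minimal sketch of how such a table could be turned into a spatial one, the example below rebuilds the first two rows in memory and uses geopandas.points_from_xy. The variable name restaurants_gdf is an assumption made for illustration, and no coordinate reference system is set because the slides do not state one.

```python
import pandas as pd
import geopandas

# First two rows of the restaurants table, rebuilt in memory for illustration
restaurants = pd.DataFrame(
    {
        "type": ["Restaurant européen", "Restaurant traditionnel français"],
        "x": [259641.6, 259572.3],
        "y": [6251867.4, 6252030.2],
    }
)

# Turn the x/y columns into point geometries
points = geopandas.points_from_xy(restaurants["x"], restaurants["y"])

# Attach the points as the geometry column (no CRS is set: the source does not state one)
restaurants_gdf = geopandas.GeoDataFrame(restaurants, geometry=points)

print(restaurants_gdf.head())
```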

Importing geospatial data with GeoPandas

import geopandas
countries = geopandas.read_file("countries.geojson")
countries.head()
          name      continent       gdp                            geometry
0  Afghanistan           Asia   64080.0  POLYGON ((61.21 35.65, 62.23 35...
1       Angola         Africa  189000.0  MULTIPOLYGON (((23.90 -11.72, 2...
2      Albania         Europe   33900.0  POLYGON ((21.02 40.84, 21.00 40...
4    Argentina  South America  879400.0  MULTIPOLYGON (((-66.96 -54.90, ...
5      Armenia           Asia   26300.0  POLYGON ((43.58 41.09, 44.97 41...

Quickly visualizing spatial data with GeoPandas

countries.plot()

The GeoDataFrame

countries.head()
          name      continent       gdp                            geometry
0  Afghanistan           Asia   64080.0  POLYGON ((61.21 35.65, 62.23 35...
1       Angola         Africa  189000.0  MULTIPOLYGON (((23.90 -11.72, 2...
2      Albania         Europe   33900.0  POLYGON ((21.02 40.84, 21.00 40...
...
type(countries)
geopandas.geodataframe.GeoDataFrame

A GeoDataFrame represents a tabular, geospatial vector dataset:

  • a 'geometry' column that holds the geometry information
  • other columns: attributes that describe each of the geometries
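
To make the split between the geometry column and the attribute columns concrete, here is a small sketch that builds a GeoDataFrame by hand. The table toy and its values are hypothetical and are not part of the countries dataset.

```python
import geopandas
from shapely.geometry import Point

# Hypothetical two-row table: two attribute columns plus one geometry column
toy = geopandas.GeoDataFrame(
    {
        "name": ["A", "B"],
        "value": [1.5, 2.5],
        "geometry": [Point(0, 0), Point(1, 2)],
    },
    geometry="geometry",
)

# The geometry column is exposed as a GeoSeries
print(type(toy.geometry))

# Every other column is an ordinary attribute column
attribute_columns = [c for c in toy.columns if c != toy.geometry.name]
print(attribute_columns)
```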

The 'geometry' attribute

countries.geometry
0      POLYGON ((61.21 35.65, 62.23 35...
1      MULTIPOLYGON (((23.90 -11.72, 2...
                      ...                
175    POLYGON ((23.22 -17.52, 22.56 -...
176    POLYGON ((29.43 -22.09, 28.79 -...
Name: geometry, Length: 176, dtype: object
type(countries.geometry)
geopandas.geoseries.GeoSeries

Spatially aware DataFrame

countries.geometry.area
0       63.593500
1      103.599439
2        3.185163
          ...    
174    112.718524
175     62.789498
176     32.280371
Length: 177, dtype: float64
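
The area attribute is computed from the coordinates as they are stored, so its values are in squared coordinate units. The countries geometries appear to use longitude and latitude, which would make the numbers above square degrees rather than square kilometres. A minimal sketch with two hypothetical rectangles shows the calculation:

```python
import geopandas
from shapely.geometry import box

# Two hypothetical rectangles: 1 x 1 and 2 x 3 in coordinate units
shapes = geopandas.GeoSeries([box(0, 0, 1, 1), box(0, 0, 2, 3)])

# area is computed element-wise and returned as a pandas Series of floats
areas = shapes.area
print(areas)
```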

Summary

A GeoDataFrame is like a pandas DataFrame:

  • all features of normal pandas DataFrames still work

but supercharged with spatial functionality:

  • plot() method
  • geometry attribute (GeoSeries)
  • spatial-specific attributes and methods (e.g. area)
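
As a sketch of both points in the summary, the example below applies ordinary pandas operations (boolean filtering, column assignment) to a GeoDataFrame and combines them with the spatial area attribute. The regions table and its names are hypothetical.

```python
import geopandas
from shapely.geometry import box

# Hypothetical table with one attribute of interest and square geometries
regions = geopandas.GeoDataFrame(
    {
        "name": ["region_a", "region_b", "region_c"],
        "continent": ["Asia", "Africa", "Asia"],
        "geometry": [box(0, 0, 1, 1), box(0, 0, 2, 2), box(0, 0, 3, 3)],
    },
    geometry="geometry",
)

# Plain pandas boolean filtering works, and the result is still a GeoDataFrame
asia = regions[regions["continent"] == "Asia"]
print(type(asia))

# Spatial attributes can be stored as regular columns
regions["area"] = regions.geometry.area
print(regions[["name", "area"]])
```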

Let's practice!
