Introduction

Visualizing Geospatial Data in Python

Mary van Valkenburg

Data Science Program Manager, Nashville Software School

Location

  • 1854 cholera outbreak in London
  • 600+ deaths

miasma cloud

Snow's dot map

What you will learn in this course

  • How to plot geospatial points as scatterplots
  • How to plot geometries using geopandas
  • How to construct a GeoDataFrame from a DataFrame
  • How to spatially join datasets
  • How to add a street map to your plots
  • When and how to create a choropleth

Longitude and latitude

father and son heights scatterplot

earth with longitude and latitude grid lines

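import matplotlib.pyplot as plt

# schools is a DataFrame with Longitude and Latitude columns;
# longitude goes on the x-axis and latitude on the y-axis
plt.scatter(schools.Longitude,
            schools.Latitude,
            c='darkgreen',
            marker='p')
plt.show()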

plain scatterplot of school locations

plt.scatter(schools.Longitude, schools.Latitude,
            c='darkgreen', marker='p')
# Label the axes, add a title, and draw gridlines
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Nashville Public Schools')
plt.grid()
plt.show()

school locations scatterplot with labels, title, gridlines
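
The same pattern can be tried without the schools dataset. The block below is a minimal, self-contained sketch that plots the five bus stop coordinates listed in the bus_stops table later in this deck. The non-interactive backend, the in-memory PNG buffer, and the plot title are assumptions made so the example runs anywhere; they are not part of the course code.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt

# (lat, lng) pairs copied from the bus_stops table in this deck
stops = [
    (36.16659, -86.781996),
    (36.165, -86.78406),
    (36.164393, -86.785451),
    (36.162249, -86.790464),
    (36.163822, -86.783791),
]
lats = [lat for lat, lng in stops]
lngs = [lng for lat, lng in stops]

fig, ax = plt.subplots()
# Longitude is the horizontal (x) coordinate; latitude is the vertical (y) coordinate
points = ax.scatter(lngs, lats, c="darkgreen", marker="p")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("Nashville bus stops (sample)")  # illustrative title
ax.grid()

# Render to an in-memory PNG instead of opening a window
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

Passing longitude first matters: swapping the two arguments would rotate and mirror the map.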

Extracting longitude and latitude

bus_stops.head()
Stop ID   StopName   Location
4431      MCC5_11    (36.16659, -86.781996)
588       CHA7AWN    (36.165, -86.78406)
590       CHA8AWN    (36.164393, -86.785451)
541       CXONGULC   (36.162249, -86.790464)
5231      7AVUNISM   (36.163822, -86.783791)

Extracting longitude and latitude

bus_stops['lat'] = [loc[0] for loc in bus_stops.Location]
bus_stops['lng'] = [loc[1] for loc in bus_stops.Location]
bus_stops.head()
Stop ID   StopName   Location                  lat         lng
4431      MCC5_11    (36.16659, -86.781996)    36.16659    -86.781996
588       CHA7AWN    (36.165, -86.78406)       36.165      -86.78406
590       CHA8AWN    (36.164393, -86.785451)   36.164393   -86.785451
541       CXONGULC   (36.162249, -86.790464)   36.162249   -86.790464
5231      7AVUNISM   (36.163822, -86.783791)   36.163822   -86.783791
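
The list comprehensions above assume each Location value is already a Python tuple. If the column was read from a CSV file, each value may instead be a string such as '(36.16659, -86.781996)', in which case loc[0] would return the character '(' rather than the latitude. The block below is a minimal sketch for that case; parse_location is a hypothetical helper, not part of the course code.

```python
import ast
import math

def parse_location(loc):
    """Hypothetical helper: return (lat, lng) from a tuple or a '(lat, lng)' string."""
    if isinstance(loc, str):
        try:
            # literal_eval only evaluates Python literals, so it is safe on raw text
            loc = ast.literal_eval(loc)
        except (ValueError, SyntaxError):
            return (math.nan, math.nan)
    try:
        lat, lng = loc
        return (float(lat), float(lng))
    except (TypeError, ValueError):
        # Not a two-item sequence of numbers
        return (math.nan, math.nan)

# Sample values copied from the Location column above, stored here as strings
locations = ['(36.16659, -86.781996)', '(36.165, -86.78406)', 'not a location']
parsed = [parse_location(loc) for loc in locations]
lats = [lat for lat, lng in parsed]
lngs = [lng for lat, lng in parsed]
```

With a DataFrame, the same helper could be applied as bus_stops['lat'] = [parse_location(loc)[0] for loc in bus_stops.Location], and likewise with index 1 for lng.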

Extracting lng and lat with regular expressions

bus_stops2.head()
Stop ID    Location
4431       MCC - BAY 11\nNashville, TN\n(36.16659, -86.78199)
588        CHARLOTTE AVE\nNashville, TN\n(36.165, -86.78406)
590        CHARLOTTE AV\nNashville, TN\n(36.164393, -86.785451)
541        CHARLOTTE\nNashville, TN\n(36.162249, -86.790464)
5231       Nashville, TN\n(36.163822, -86.783791)

Extracting lng and lat with regular expressions

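import re
import numpy as np

# Capture the two comma-separated values inside the parentheses
lat_lng_pattern = re.compile(r'\((.*),\s*(.*)\)', flags=re.MULTILINE)

def extract_lat_lng(address):
    """Return (lat, lng) parsed from address, or (nan, nan) if parsing fails."""
    try:
        lat_lng_match = lat_lng_pattern.search(address)
        lat = float(lat_lng_match.group(1))
        lng = float(lat_lng_match.group(2))
        return (lat, lng)
    except (AttributeError, TypeError, ValueError):
        # No match, non-string input, or non-numeric text in the parentheses
        return (np.nan, np.nan)

lat_lngs = [extract_lat_lng(location) for location in bus_stops2['Location']]
bus_stops2['lat'] = [lat for lat, lng in lat_lngs]
bus_stops2['lng'] = [lng for lat, lng in lat_lngs]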
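
The pattern above captures any text between the parentheses and relies on float() to reject values that are not numbers. As an alternative, the block below is a minimal sketch of a stricter pattern that only matches signed decimal numbers. The names strict_pattern and extract_lat_lng_strict are hypothetical, and the sample strings are copied from the bus_stops2 table.

```python
import math
import re

# Hypothetical stricter pattern: each group must look like a signed decimal number
strict_pattern = re.compile(
    r'\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)'
)

def extract_lat_lng_strict(address):
    """Return (lat, lng) from the first '(number, number)' pair, else (nan, nan)."""
    if not isinstance(address, str):
        return (math.nan, math.nan)
    match = strict_pattern.search(address)
    if match is None:
        return (math.nan, math.nan)
    return (float(match.group(1)), float(match.group(2)))

# Sample values copied from the bus_stops2 table, plus one row without coordinates
samples = [
    'MCC - BAY 11\nNashville, TN\n(36.16659, -86.78199)',
    'Nashville, TN\n(36.163822, -86.783791)',
    'Nashville, TN',
]
results = [extract_lat_lng_strict(s) for s in samples]
```

Because the groups can only match numeric text, the float() calls cannot fail here, so no try/except is needed; rows without a coordinate pair come back as (nan, nan).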

Nashville open data

screenshot of nashville.data.gov

giant chicken head

Let's practice!
