Introduction to APIs

Streamlined Data Ingestion with pandas

Amany Mahfouz

Instructor

Application Programming Interfaces

  • Defines how an application communicates with other programs
  • A way to get data from an application without knowing its database details

A form with some entries checked and some blank

A form and a storefront. An arrow indicates that the form is being sent to the store.

An open package and a storefront. The package is in the place the order form used to be. An arrow indicates that the store has sent the package.

Requests

  • Send and get data from websites
  • Not tied to a particular API
  • requests.get() to get data from a URL

requests logo

requests.get()

  • requests.get(url_string) to get data from a URL
  • Keyword arguments
    • params keyword: takes a dictionary of parameter names and values to customize the API request
    • headers keyword: takes a dictionary, can be used to provide user authentication to the API
  • Result: a response object, containing data and metadata
    • response.json() will return just the JSON data
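As a minimal sketch of how the params and headers keywords shape a request, the example below builds a request without sending it, so it needs no network access or real API key. The endpoint URL and the key "FAKE_KEY" are hypothetical placeholders, and requests.Request(...).prepare() is used here only to inspect roughly what requests.get() would send.

```python
import requests

# Hypothetical endpoint and key, used only to illustrate the call shape.
url = "https://api.example.com/v1/search"
params = {"term": "bookstore", "location": "San Francisco"}
headers = {"Authorization": "Bearer FAKE_KEY"}

# Build, but do not send, a GET request with the same url, params, and headers
# that would be passed to requests.get().
prepared = requests.Request("GET", url, params=params, headers=headers).prepare()

# The params dictionary is encoded into the URL's query string.
print(prepared.url)
# The headers dictionary travels with the request as HTTP headers.
print(prepared.headers["Authorization"])
```

The point of the sketch is that params ends up in the URL while headers does not, which is why credentials such as API keys usually go in headers.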

response.json() and pandas

  • response.json() returns the parsed JSON as Python objects, typically a dictionary
  • read_json() expects strings, not dictionaries
  • Load the response JSON to a dataframe with pd.DataFrame()
    • read_json() will give an error!
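The sketch below illustrates the point above with a small made-up payload. json.loads() stands in for response.json(), since both produce already-parsed Python objects; the "businesses" key mirrors the Yelp example later in the deck, but the records themselves are invented for illustration.

```python
import json

import pandas as pd

# Stand-in for response.json(): a made-up payload parsed into Python objects.
raw_text = (
    '{"businesses": ['
    '{"name": "Shop A", "rating": 4.5}, '
    '{"name": "Shop B", "rating": 4.0}'
    '], "total": 2}'
)
data = json.loads(raw_text)

# data is a dictionary, not a JSON string, so pass the list of records
# straight to pd.DataFrame() instead of pd.read_json().
shops = pd.DataFrame(data["businesses"])
print(shops)
```

Each dictionary in the list becomes one row, and the dictionary keys become the column names.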

Yelp Business Search API

Yelp logo

Yelp Business Search API documentation. Request endpoint and some parameters are shown.

Yelp Business Search API documentation. API endpoint URL is highlighted.

Yelp Business Search API documentation. An optional term parameter is highlighted.

Yelp Business Search API documentation. Required locational parameters are highlighted.

Yelp Business Search API documentation, showing an example response JSON

Making Requests

import requests
import pandas as pd

api_url = "https://api.yelp.com/v3/businesses/search"
# Set up parameter dictionary according to documentation
params = {"term": "bookstore", "location": "San Francisco"}

# Set up header dictionary w/ API key according to documentation
# (api_key is assumed to already hold your Yelp API key as a string)
headers = {"Authorization": "Bearer {}".format(api_key)}

# Call the API
response = requests.get(api_url, params=params, headers=headers)

Parsing Responses

# Isolate the JSON data from the response object
data = response.json()
print(data)
{'businesses': [{'id': '_rbF2ooLcMRA7Kh8neIr4g', 'alias': 'city-lights-bookstore-san-francisco', 'name': 'City Lights Bookstore', 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/VRydkkpVbA3CeVLBKzs2Vw/o.jpg', 'is_closed': False,
# Load businesses data to a dataframe
bookstores = pd.DataFrame(data["businesses"])
print(bookstores.head(2))
                                  alias        ...                                                    url
0   city-lights-bookstore-san-francisco        ...      https://www.yelp.com/biz/city-lights-bookstore...
1  alexander-book-company-san-francisco        ...      https://www.yelp.com/biz/alexander-book-compan...

[2 rows x 16 columns]
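The request and parsing steps above can be combined into one function, sketched below. The function name search_businesses, the timeout value, and the raise_for_status() check are additions for illustration and are not part of the original slides; the endpoint, parameters, and header format are taken from them.

```python
import pandas as pd
import requests

YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def search_businesses(api_key, term, location, url=YELP_SEARCH_URL):
    """Return the "businesses" records from a search response as a dataframe."""
    response = requests.get(
        url,
        params={"term": term, "location": location},
        headers={"Authorization": "Bearer {}".format(api_key)},
        timeout=10,  # assumption: give up after 10 seconds rather than hang
    )
    # Raise an exception on 4xx/5xx responses instead of parsing an error body.
    response.raise_for_status()
    # Isolate the JSON data and load the list of businesses into a dataframe.
    return pd.DataFrame(response.json()["businesses"])
```

With a valid key in api_key, a call such as search_businesses(api_key, "bookstore", "San Francisco") would be expected to return a dataframe like the bookstores one above.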

Let's practice!
