Analyzing US Census Data in Python
Lee Hachadoorian
Asst. Professor of Instruction, Temple University
Full count of core demographic characteristics:
Sample of extensive social and economic characteristics:
Variable   | Label
-----------|--------------------------------------
B25045001  | Total
B25045002  |   Owner Occupied
B25045003  |     No Vehicle Available
B25045004  |       Householder 15 to 34 Years
B25045005  |       Householder 35 to 64 Years
B25045006  |       Householder 65 Years and Over
B25045007  |     1 or More Vehicles Available
B25045008  |       Householder 15 to 34 Years
B25045009  |       Householder 35 to 64 Years
B25045010  |       Householder 65 Years and Over
B25045011  |   Renter Occupied
B25045012  |     No Vehicle Available
B25045013  |       Householder 15 to 34 Years
B25045014  |       Householder 35 to 64 Years
B25045015  |       Householder 65 Years and Over
B25045016  |     1 or More Vehicles Available
B25045017  |       Householder 15 to 34 Years
B25045018  |       Householder 35 to 64 Years
B25045019  |       Householder 65 Years and Over
import requests
import pandas as pd

HOST, dataset = "https://api.census.gov/data", "acs/acs1"

get_vars = ["B25045_" + str(i + 1).zfill(3) + "E" for i in range(19)]
get_vars = ["NAME"] + get_vars
print(get_vars)
['NAME', 'B25045_001E', 'B25045_002E', 'B25045_003E', 'B25045_004E',
'B25045_005E', 'B25045_006E', 'B25045_007E', 'B25045_008E', 'B25045_009E',
'B25045_010E', 'B25045_011E', 'B25045_012E', 'B25045_013E', 'B25045_014E',
'B25045_015E', 'B25045_016E', 'B25045_017E', 'B25045_018E', 'B25045_019E']
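For orientation, here is a rough sketch of the kind of URL these variable names end up in. It uses the standard library's urlencode only to make the query string visible. In practice requests assembles the URL itself, and its exact encoding may differ slightly. The params name and the fixed year are assumptions for illustration.

```python
from urllib.parse import urlencode

HOST, dataset = "https://api.census.gov/data", "acs/acs1"
get_vars = ["NAME"] + ["B25045_" + str(i + 1).zfill(3) + "E" for i in range(19)]

# Query parameters: a comma-separated variable list and a geography
params = {"get": ",".join(get_vars), "for": "us:*"}

# Illustrative only: a single year, joined the same way as HOST/year/dataset
year = 2011
url = "/".join([HOST, str(year), dataset]) + "?" + urlencode(params)
print(url)
```

No request is sent here, so the sketch can be run offline to check the variable list before calling the API.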
predicates = {}
predicates["get"] = ",".join(get_vars)
predicates["for"] = "us:*"

# Initialize DataFrame collector
dfs = []
for year in range(2011, 2018):
    base_url = "/".join([HOST, str(year), dataset])
    r = requests.get(base_url, params=predicates)
    df = pd.DataFrame(columns=r.json()[0], data=r.json()[1:])
    # Add column to hold year value
    df["year"] = year
    dfs.append(df)

# Concatenate all DataFrames in collector
us = pd.concat(dfs, ignore_index=True)
print(us.head())
            NAME B25045_001E B25045_002E  ... B25045_019E us  year
0  United States   114991725    74264435  ...     3232812  1  2011
1  United States   115969540    74119256  ...     3447172  1  2012
2  United States   116291033    73843861  ...     3662322  1  2013
3  United States   117259427    73991995  ...     3847400  1  2014
4  United States   118208250    74506512  ...     4044430  1  2015

[5 rows x 22 columns]
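The values parsed from the JSON response arrive as strings, so the estimate columns need converting before any arithmetic. A minimal sketch, assuming a small hand-typed DataFrame holding the first two rows shown above (only two of the nineteen estimate columns). The est_cols and pct_owner names are introduced here for illustration.

```python
import pandas as pd

# First two rows of the output above, typed in by hand as strings
us = pd.DataFrame({
    "NAME": ["United States", "United States"],
    "B25045_001E": ["114991725", "115969540"],  # Total
    "B25045_002E": ["74264435", "74119256"],    # Owner Occupied
    "year": [2011, 2012],
})

# Convert every B25045 estimate column from string to integer
est_cols = [c for c in us.columns if c.startswith("B25045_")]
us[est_cols] = us[est_cols].astype("int64")

# Owner-occupied households as a percentage of all occupied households
us["pct_owner"] = 100 * us["B25045_002E"] / us["B25045_001E"]
print(us[["year", "pct_owner"]])
```

The same pattern should apply to the full seven-year DataFrame, since the column selection depends only on the B25045_ prefix.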