Introduction to other file types

Introduction to Importing Data in Python

Hugo Bowne-Anderson

Data Scientist at DataCamp

Other file types

  • Excel spreadsheets
  • MATLAB files
  • SAS files
  • Stata files
  • HDF5 files

Pickled files

  • File type native to Python
  • Motivation: for many data types, it isn’t obvious how to store them in a file
  • Pickled files are serialized
  • Serialize = convert an object to a byte stream
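As a minimal sketch of what "serialize" means here, the snippet below turns a dictionary into bytes and back, entirely in memory. The variable names and the dictionary are illustrative assumptions; only `pickle.dumps` and `pickle.loads` come from the standard library.

```python
import pickle

# Hypothetical example object; any picklable Python object would do.
fruit = {'peaches': 13, 'apples': 4, 'oranges': 11}

# Serialize: convert the object to a bytes object (the byte stream).
payload = pickle.dumps(fruit)
print(type(payload))

# Deserialize: rebuild an equal, but separate, object from the bytes.
restored = pickle.loads(payload)
print(restored == fruit)
```

The restored dictionary compares equal to the original but is a new object, which is why pickling is a convenient way to store Python objects that have no natural text format.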

Pickled files

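import pickle
# Open in binary read mode ('rb'): pickled files contain bytes, not text
with open('pickled_fruit.pkl', 'rb') as file:
    data = pickle.load(file)  # deserialize the stored object
print(data)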
{'peaches': 13, 'apples': 4, 'oranges': 11}
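The slide assumes that pickled_fruit.pkl already exists. One way such a file could be produced and read back is sketched below; the temporary directory and the dictionary contents are assumptions made so the example runs on its own.

```python
import os
import pickle
import tempfile

# Illustrative data, matching the shape of the dictionary on the slide.
fruit = {'peaches': 13, 'apples': 4, 'oranges': 11}

# Write to a throwaway directory so no existing file is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'pickled_fruit.pkl')

# 'wb' = write binary: pickle.dump writes the serialized bytes to the file.
with open(path, 'wb') as file:
    pickle.dump(fruit, file)

# 'rb' = read binary: pickle.load rebuilds the object from the file.
with open(path, 'rb') as file:
    data = pickle.load(file)

print(data)
```

Note that `pickle.load` can execute arbitrary code while rebuilding objects, so it should only be used on files from a trusted source.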

Importing Excel spreadsheets

import pandas as pd
file = 'urbanpop.xlsx'
data = pd.ExcelFile(file)
print(data.sheet_names)
['1960-1966', '1967-1974', '1975-2011']
df1 = data.parse('1960-1966') # sheet name, as a string
df2 = data.parse(0) # sheet index (zero-based), as an integer

You’ll learn:

  • How to customize your import
  • Skip rows
  • Import certain columns
  • Change column names
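The three customizations listed above might look roughly like the following sketch. It assumes pandas and an Excel engine such as openpyxl are installed, and it builds a small made-up workbook first so that it runs without urbanpop.xlsx; the file name demo.xlsx, the values, and the new column names are all hypothetical.

```python
import os
import tempfile

import pandas as pd

# Build a small throwaway workbook so the example is self-contained.
# The contents are invented for illustration only.
path = os.path.join(tempfile.mkdtemp(), 'demo.xlsx')
raw = pd.DataFrame({
    'Country': ['A', 'B', 'C'],
    'Code': ['a', 'b', 'c'],
    '1960': [1.0, 2.0, 3.0],
})
raw.to_excel(path, sheet_name='1960-1966', index=False)

xls = pd.ExcelFile(path)
df = xls.parse(
    '1960-1966',
    skiprows=[1],                         # skip spreadsheet row 1 (the first data row)
    usecols=[0, 2],                       # import only the first and third columns
    names=['Country', 'Urban pop 1960'],  # replace the header with new column names
)
print(df)
```

Here `skiprows` takes a list of zero-based row positions, `usecols` a list of zero-based column positions, and `names` one label per imported column.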

Let's practice!
