DataFrames and their methods

Python for Spreadsheet Users

Chris Cardillo

Data Scientist

Where we left off

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

print(fruit)

simple fruit whole dataset.png

Anatomy of a pandas DataFrame

simple fruit whole dataset.png

simple fruit df cols.png

simple fruit df numerical column.png

simple fruit df object-character column.png

simple fruit df rows 1.png

simple fruit df rows 2.png

simple fruit df index.png
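
The parts of a DataFrame shown in the figures above (columns, their data types, rows, and the index) can also be inspected in code. The sketch below is only an illustration: it builds a small stand-in DataFrame instead of reading fruit.xlsx. The column names match the slides, but every value is made up.

```python
import pandas as pd

# Hypothetical stand-in for fruit.xlsx: column names follow the slides,
# but all of the values below are invented for illustration.
fruit = pd.DataFrame({
    'name': ['mango', 'apple', 'kiwi', 'cherry', 'banana', 'plum', 'grape', 'lemon'],
    'color': ['orange', 'red', 'green', 'red', 'yellow', 'purple', 'purple', 'yellow'],
    'price_usd': [1.50, 0.50, 0.75, 3.00, 0.25, 0.90, 2.50, 0.60],
})

print(fruit.shape)           # (number of rows, number of columns)
print(list(fruit.columns))   # the column labels
print(fruit.dtypes)          # one data type per column
print(fruit['price_usd'])    # a single numerical column
print(fruit.loc[0])          # a single row, looked up by its index label
print(fruit.index)           # the default index counts up from 0
```

Selecting a column with square brackets and a row with .loc are just two common ways to look inside a DataFrame; they are shown here only to make the anatomy concrete.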

DataFrame methods

  • .head()
  • .info()
  • .describe()
  • .sort_values()

The .head() method

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

print(fruit.head())

simple fruit df head.png

The .head() method

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

print(fruit.head(2))

simple fruit df head with arg.png
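
To try .head() without the Excel file, here is a minimal sketch using a made-up stand-in for the fruit data (the values are assumptions, not the course dataset). It shows that .head() returns the first five rows by default, and the first n rows when given a number.

```python
import pandas as pd

# Hypothetical stand-in for fruit.xlsx; all values are invented.
fruit = pd.DataFrame({
    'name': ['mango', 'apple', 'kiwi', 'cherry', 'banana', 'plum', 'grape', 'lemon'],
    'color': ['orange', 'red', 'green', 'red', 'yellow', 'purple', 'purple', 'yellow'],
    'price_usd': [1.50, 0.50, 0.75, 3.00, 0.25, 0.90, 2.50, 0.60],
})

first_five = fruit.head()   # no argument: the first 5 rows
first_two = fruit.head(2)   # with an argument: the first 2 rows

print(first_five)
print(first_two)
print(len(fruit))           # the original DataFrame keeps all of its rows
```

Because .head() returns a new, shorter DataFrame, the original fruit is left untouched.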

The .info() method

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

# .info() prints its summary itself and returns None, so it is not wrapped in print()
fruit.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype  
 --  ------     --------------  -----  
 0   name       8 non-null      object 
 1   color      8 non-null      object 
 2   price_usd  8 non-null      float64
dtypes: float64(1), object(2)
memory usage: 272.0+ bytes
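
One detail worth seeing in code: .info() prints its report directly and returns None. The sketch below uses a made-up stand-in for the fruit data (an assumption, not the course file) and also shows one way to capture the report as text through the method's buf parameter.

```python
import io

import pandas as pd

# Hypothetical stand-in for fruit.xlsx; all values are invented.
fruit = pd.DataFrame({
    'name': ['mango', 'apple', 'kiwi', 'cherry', 'banana', 'plum', 'grape', 'lemon'],
    'color': ['orange', 'red', 'green', 'red', 'yellow', 'purple', 'purple', 'yellow'],
    'price_usd': [1.50, 0.50, 0.75, 3.00, 0.25, 0.90, 2.50, 0.60],
})

result = fruit.info()   # prints the summary as a side effect
print(result)           # the method itself returns None

# To keep the summary as a string, hand .info() a text buffer to write into.
buffer = io.StringIO()
fruit.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```

Capturing the text is optional; for a quick look at column types and missing values, calling fruit.info() on its own is enough.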

The .describe() method

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

print(fruit.describe())

simple fruit df describe method.png
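
As a rough illustration of what .describe() produces, the sketch below runs it on a made-up stand-in for the fruit data (the values are assumptions chosen for easy arithmetic). By default it summarizes only the numerical columns, and the result is itself a DataFrame, so single statistics can be looked up by name.

```python
import pandas as pd

# Hypothetical stand-in for fruit.xlsx; all values are invented.
fruit = pd.DataFrame({
    'name': ['mango', 'apple', 'kiwi', 'cherry', 'banana', 'plum', 'grape', 'lemon'],
    'color': ['orange', 'red', 'green', 'red', 'yellow', 'purple', 'purple', 'yellow'],
    'price_usd': [1.50, 0.50, 0.75, 3.00, 0.25, 0.90, 2.50, 0.60],
})

stats = fruit.describe()   # count, mean, std, min, quartiles, and max
print(stats)

# The summary is a DataFrame too: rows are statistics, columns are the
# numerical columns of the original data.
mean_price = stats.loc['mean', 'price_usd']
max_price = stats.loc['max', 'price_usd']
print(mean_price, max_price)
```

For spreadsheet users, this is roughly the equivalent of computing COUNT, AVERAGE, MIN, and MAX for each numerical column in one step.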

The .sort_values() method

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

fruit = fruit.sort_values('name')
fruit = fruit.reset_index(drop=True)

print(fruit)

simple fruit df sorted by name asc.png

The .sort_values() method

import pandas as pd

fruit = pd.read_excel('fruit.xlsx')

fruit = fruit.sort_values('price_usd', ascending=False)
fruit = fruit.reset_index(drop=True)

print(fruit.head(3))

simple fruit df sorted by price desc head 3.png
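
The two sorting slides can be tried without the Excel file. The sketch below uses a made-up stand-in for the fruit data (an assumption, not the course dataset) and shows why .reset_index(drop=True) follows the sort: after sorting, each row still carries its old index label.

```python
import pandas as pd

# Hypothetical stand-in for fruit.xlsx; all values are invented.
fruit = pd.DataFrame({
    'name': ['mango', 'apple', 'kiwi', 'cherry', 'banana', 'plum', 'grape', 'lemon'],
    'color': ['orange', 'red', 'green', 'red', 'yellow', 'purple', 'purple', 'yellow'],
    'price_usd': [1.50, 0.50, 0.75, 3.00, 0.25, 0.90, 2.50, 0.60],
})

# Ascending sort (the default): rows move, but keep their old index labels.
by_name = fruit.sort_values('name')
labels_before_reset = by_name.index.tolist()
print(labels_before_reset)

# drop=True discards the old labels and renumbers the rows from 0.
by_name = by_name.reset_index(drop=True)
print(by_name)

# Descending sort by price, then keep the three most expensive fruits.
by_price = fruit.sort_values('price_usd', ascending=False)
top_three = by_price.reset_index(drop=True).head(3)
print(top_three)
```

Note that .sort_values() returns a new DataFrame rather than changing fruit in place, which is why the slides assign the result back to fruit.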

Your turn!
