Using proper data types

Analyzing Police Activity with pandas

Kevin Markham

Founder, Data School

Examining the data types

ri.dtypes
stop_date             object
stop_time             object
driver_gender         object
...                      ...
stop_duration         object
drugs_related_stop      bool
district              object
  • object: Python strings (or other Python objects)
  • bool: True and False values
  • Other types: int, float, datetime, category
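
A minimal sketch of the same inspection on a tiny stand-in frame. The name `ri_sample` and its three rows are hypothetical, invented for illustration; only the column names come from the output above. The exact label pandas prints for text columns can differ between versions, so the sketch only relies on the bool column.

```python
import pandas as pd

# Hypothetical three-row stand-in for `ri` (values invented for illustration)
ri_sample = pd.DataFrame({
    "stop_date": ["2005-01-04", "2005-01-23", "2005-02-17"],
    "driver_gender": ["M", "M", "F"],
    "drugs_related_stop": [False, True, False],
})

# One dtype per column, as in ri.dtypes
print(ri_sample.dtypes)

# List only the columns stored as bool
bool_cols = ri_sample.select_dtypes(include="bool").columns.tolist()
print(bool_cols)
```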

Why do data types matter?

  • Affects which operations you can perform
  • Avoid storing data as strings (when possible)
    • int, float: enables mathematical operations
    • datetime: enables date-based attributes and methods
    • category: uses less memory and speeds up operations (best for columns with few unique values)
    • bool: enables logical and mathematical operations
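
The points above can be sketched with one small Series per type. This is an illustration under stated assumptions, not the course's code: the prices and dates reuse the apple example from this section, while the zone labels and the bool values are hypothetical.

```python
import pandas as pd

# int, float: math works only after converting from strings
prices = pd.Series(["164.34", "167.37", "172.99"]).astype("float")
mean_price = prices.mean()

# datetime: the .dt accessor exposes date-based attributes
dates = pd.to_datetime(pd.Series(["2/13/18", "2/14/18", "2/15/18"]),
                       format="%m/%d/%y")
months = dates.dt.month.tolist()

# category: repeated labels are stored once, plus small integer codes
# (hypothetical zone labels, repeated to make the size difference visible)
zones = pd.Series(["Zone A", "Zone B"] * 1000)
text_bytes = zones.memory_usage(deep=True)
category_bytes = zones.astype("category").memory_usage(deep=True)

# bool: True counts as 1, so sum() counts and mean() gives a proportion
drugs_related = pd.Series([False, True, False])  # hypothetical values
n_true = drugs_related.sum()
share_true = drugs_related.mean()

print(mean_price, months, text_bytes, category_bytes, n_true, share_true)
```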

Fixing a data type

apple
      date   time   price
0  2/13/18  16:00  164.34
1  2/14/18  16:00  167.37
2  2/15/18  16:00  172.99
apple.price.dtype
dtype('O')
apple['price'] = apple.price.astype('float')
apple.price.dtype
dtype('float64')
  • Dot notation: apple.price
  • Bracket notation: apple['price'] (use on the left side of an assignment)
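
A sketch of why the two notations are not interchangeable when assigning. The frame rebuilds the apple example, assuming the prices start out as strings; the names `price_copy` and `price_attr` are hypothetical.

```python
import warnings
import pandas as pd

# Rebuild the apple example with prices stored as strings
apple = pd.DataFrame({
    "date": ["2/13/18", "2/14/18", "2/15/18"],
    "time": ["16:00", "16:00", "16:00"],
    "price": ["164.34", "167.37", "172.99"],
})

# Bracket notation on the left, dot notation on the right
apple["price"] = apple.price.astype("float")

# Bracket notation with a new name creates a new column
apple["price_copy"] = apple.price

# Dot notation with a new name only sets a Python attribute;
# pandas warns about this, so the warning is silenced here
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    apple.price_attr = apple.price

print(apple.price.dtype)
print("price_copy" in apple.columns)
print("price_attr" in apple.columns)
```

Reading a column works with either notation, which is why `apple.price` is fine on the right-hand side.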

Let's practice!
