Preprocessing for Machine Learning in Python
James Chapman
Curriculum Manager, DataCamp
print(temps)
     city  day1  day2  day3
0     NYC  68.3  67.9  67.8
1      SF  75.1  75.5  74.9
2      LA  80.3  84.0  81.3
3  Boston  63.0  61.0  61.2
# Row-wise mean (axis=1) across the day1 through day3 columns
temps["mean"] = temps.loc[:, "day1":"day3"].mean(axis=1)
print(temps)
     city  day1  day2  day3   mean
0     NYC  68.3  67.9  67.8  68.00
1      SF  75.1  75.5  74.9  75.17
2      LA  80.3  84.0  81.3  81.87
3  Boston  63.0  61.0  61.2  61.73
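The row-wise mean above can be reproduced end to end with the sketch below. It rebuilds `temps` from the values printed on the slide. The `round(2)` call is an assumption added here so the result matches the two-decimal output shown; the original snippet does not round explicitly.

```python
import pandas as pd

# Rebuild the example frame from the values printed above.
temps = pd.DataFrame({
    "city": ["NYC", "SF", "LA", "Boston"],
    "day1": [68.3, 75.1, 80.3, 63.0],
    "day2": [67.9, 75.5, 84.0, 61.0],
    "day3": [67.8, 74.9, 81.3, 61.2],
})

# Label-based slicing with .loc includes both endpoints,
# so this selects day1, day2 and day3.
daily = temps.loc[:, "day1":"day3"]

# axis=1 aggregates across columns, giving one mean per row (per city).
# Rounding to 2 decimals is an assumption made to match the displayed output.
temps["mean"] = daily.mean(axis=1).round(2)

print(temps)
```

Selecting the columns by label range keeps the non-numeric `city` column out of the calculation, which is why the slice is taken before calling `mean`.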
print(purchases)
               date purchase
0      July 30 2011   $45.08
1  February 01 2011   $19.48
2   January 29 2011   $76.09
3     March 31 2012   $32.61
4  February 05 2011   $75.98
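# Parse the date strings into datetime values
purchases["date_converted"] = pd.to_datetime(purchases["date"])
# Extract the month number (1-12) from each parsed date
purchases["month"] = purchases["date_converted"].dt.month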
print(purchases)
               date purchase date_converted  month
0      July 30 2011   $45.08     2011-07-30      7
1  February 01 2011   $19.48     2011-02-01      2
2   January 29 2011   $76.09     2011-01-29      1
3     March 31 2012   $32.61     2012-03-31      3
4  February 05 2011   $75.98     2011-02-05      2
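A self-contained version of the date conversion might look like the sketch below. It rebuilds `purchases` from the values printed on the slide. Passing an explicit `format` string is an assumption added here for predictability; the original call lets pandas infer the format.

```python
import pandas as pd

# Rebuild the example frame from the values printed above.
purchases = pd.DataFrame({
    "date": [
        "July 30 2011",
        "February 01 2011",
        "January 29 2011",
        "March 31 2012",
        "February 05 2011",
    ],
    "purchase": ["$45.08", "$19.48", "$76.09", "$32.61", "$75.98"],
})

# Parse the strings into datetime values.
# The explicit format (full month name, day, year) is an assumption;
# the slide's version relies on pandas inferring it.
purchases["date_converted"] = pd.to_datetime(
    purchases["date"], format="%B %d %Y"
)

# The .dt accessor exposes datetime components; .month gives 1 to 12.
purchases["month"] = purchases["date_converted"].dt.month

print(purchases)
```

Once the column holds real datetime values, other components such as `.dt.year` or `.dt.day` are available through the same accessor.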