Preprocessing for Machine Learning in Python
James Chapman
Curriculum Manager, DataCamp
print(volunteer.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 665 entries, 0 to 664
Data columns (total 35 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 opportunity_id 665 non-null int64
1 content_id 665 non-null int64
2 vol_requests 665 non-null int64
3 event_time 665 non-null int64
4 title 665 non-null object
.. ... ... ...
34 NTA 0 non-null float64
dtypes: float64(13), int64(8), object(14)
memory usage: 182.0+ KB
object: string/mixed types
int64: integer
float64: float
datetime64: dates and times
print(df)
A B C
0 1 string 1.0
1 2 string2 2.0
2 3 string3 3.0
print(df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 A 3 non-null int64
1 B 3 non-null object
2 C 3 non-null object
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes
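Column C looks numeric but is not reported as a float. One plausible cause, sketched below, is that its values were stored as strings. The data here is hypothetical and built only to mirror the printed frame; the slides do not show how df was created.

```python
import pandas as pd

# Hypothetical data mirroring the printed frame; C holds numeric-looking strings.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": ["string", "string2", "string3"],
    "C": ["1.0", "2.0", "3.0"],
})

# Each value in C is a Python str, so pandas does not infer a float dtype for it.
first_value = df["C"].iloc[0]
print(type(first_value))

# Per-column dtypes, the same information shown in the Dtype column of df.info().
print(df.dtypes)
```

Inspecting a single value with type() is a quick way to confirm why a column ended up with a non-numeric dtype.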
print(df)
A B C
0 1 string 1.0
1 2 string2 2.0
2 3 string3 3.0
df["C"] = df["C"].astype("float")
print(df.dtypes)
A int64
B object
C float64
dtype: object
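The conversion above can be tried end to end with a minimal sketch. The DataFrame is hypothetical (rebuilt to match the printed output), and the failed flag is an illustrative name, not part of the course code.

```python
import pandas as pd

# Hypothetical frame matching the slide: C stores numbers as strings.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": ["string", "string2", "string3"],
    "C": ["1.0", "2.0", "3.0"],
})

# Convert C to float; every value parses cleanly, so this succeeds.
df["C"] = df["C"].astype("float")
print(df.dtypes)

# astype raises ValueError when a value cannot be parsed as a float,
# so converting B (plain text) fails instead of silently producing NaN.
failed = False
try:
    df["B"].astype("float")
except ValueError as err:
    failed = True
    print("Could not convert B:", err)
```

Because astype fails loudly on unparseable values, it is worth checking a column's contents before converting it.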