Feature Engineering for Machine Learning in Python
Robert O'Callaghan
Director of Data Science, Ordergroove
print(df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 999 entries, 0 to 998
Data columns (total 12 columns):
 #   Column                      Non-Null Count  Dtype
---  ------                      --------------  -----
 0   SurveyDate                  999 non-null    object
...  ...                         ...             ...
 8   StackOverflowJobsRecommend  487 non-null    float64
 9   VersionControl              999 non-null    object
 10  Gender                      693 non-null    object
 11  RawSalary                   665 non-null    object
dtypes: float64(2), int64(2), object(8)
memory usage: 93.7+ KB
print(df.isnull())
   StackOverflowJobsRecommend  VersionControl  ...  \
0                        True           False  ...
1                       False           False  ...
2                       False           False  ...
3                        True           False  ...
4                       False           False  ...

   Gender  RawSalary
0   False       True
1   False      False
2    True       True
3   False      False
4   False      False
print(df['StackOverflowJobsRecommend'].isnull().sum())
512
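The count above is consistent with the df.info() output: 999 rows minus 487 non-null values leaves 512 missing. A minimal sketch of the same check across every column at once, assuming a small hypothetical DataFrame named toy (its values are invented for illustration and are not taken from the survey data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are borrowed from the survey for readability,
# but the values are made up.
toy = pd.DataFrame({
    "StackOverflowJobsRecommend": [np.nan, 7.0, 8.0, np.nan, 8.0],
    "VersionControl": ["Git", "Git;Subversion", "Git", "Git", "Git"],
    "Gender": ["Male", "Male", None, "Male", "Female"],
})

# isnull() returns a Boolean frame; summing counts the True values per column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Missing count plus non-null count equals the number of rows for every column.
non_null_per_column = toy.notnull().sum()
assert (missing_per_column + non_null_per_column == len(toy)).all()
```

Calling isnull().sum() on the whole DataFrame instead of a single column gives one missing-value count per column, which can be easier to scan than the non-null counts in df.info().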
print(df.notnull())
   StackOverflowJobsRecommend  VersionControl  ...  \
0                       False            True  ...
1                        True            True  ...
2                        True            True  ...
3                       False            True  ...
4                        True            True  ...

   Gender  RawSalary
0    True      False
1    True       True
2   False      False
3    True       True
4    True       True
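As the two outputs suggest, notnull() is the element-wise opposite of isnull(). A short sketch of that relationship, and of using the resulting mask to keep only rows where a value is present, assuming a hypothetical three-row DataFrame named toy (the salary string is an invented placeholder):

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration.
toy = pd.DataFrame({
    "Gender": ["Male", None, "Female"],
    "RawSalary": [None, "$70,841.00", None],
})

# notnull() is the element-wise negation of isnull().
assert toy.notnull().equals(~toy.isnull())

# A Boolean mask from notnull() filters to rows where the column has a value.
has_salary = toy[toy["RawSalary"].notnull()]
print(has_salary)
```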