Validating data types

books.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 350 entries, 0 to 349
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
--   ------  --------------  -----  
 0   name    350 non-null    object 
 1   author  350 non-null    object 
 2   rating  350 non-null    float64
 3   year    350 non-null    float64
 4   genre   350 non-null    object 
dtypes: float64(2), object(3)
memory usage: 13.8+ KB

books.dtypes
name       object
author     object
rating    float64
year      float64
genre      object
dtype: object
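
The dtype listing above can also be checked in code rather than by eye. A minimal sketch, assuming a small stand-in for `books` (the rows and the `expected_numeric` list are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the books DataFrame (illustrative values only)
books = pd.DataFrame({
    "name": ["Book A", "Book B", "Book C"],
    "author": ["Author X", "Author Y", "Author Z"],
    "rating": [4.7, 4.6, 4.4],
    "year": [2016.0, 2011.0, 2018.0],
    "genre": ["Non Fiction", "Fiction", "Fiction"],
})

# Columns we assume should hold numbers
expected_numeric = ["rating", "year"]

# Collect any expected-numeric column that is not stored as a numeric dtype
not_numeric = [
    col for col in expected_numeric
    if not pd.api.types.is_numeric_dtype(books[col])
]

# year is numeric here, but stored as float rather than integer
year_is_integer = pd.api.types.is_integer_dtype(books["year"])

print(not_numeric)
print(year_is_integer)
```

An empty `not_numeric` list means every listed column is numeric; `year_is_integer` flags that `year` may still need converting.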
Updating data types

books["year"] = books["year"].astype(int)

books.dtypes
name       object
author     object
rating    float64
year        int64
genre      object
dtype: object
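
A caution on `.astype(int)`: it raises an error when the column contains missing values, and it truncates any fractional part. A minimal sketch of a guarded conversion, assuming a hypothetical three-row `year` column:

```python
import pandas as pd

# Hypothetical stand-in: year stored as float, as in the dtypes output above
books = pd.DataFrame({"year": [2016.0, 2011.0, 2018.0]})

year = books["year"]

# Only cast when nothing is missing and every value is a whole number
no_missing = year.notna().all()
all_whole = (year % 1 == 0).all()

if no_missing and all_whole:
    books["year"] = year.astype(int)

print(books["year"].tolist())
```

If either check fails, the column is left unchanged so the offending rows can be inspected first.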
Updating data types

| Type       | Python Name |
| String     | str         |
| Integer    | int         |
| Float      | float       |
| Dictionary | dict        |
| List       | list        |
| Boolean    | bool        |
Validating categorical data

books["genre"].isin(["Fiction", "Non Fiction"])
0       True
1       True
2       True
3       True
4      False
       ...
345     True
346     True
347     True
348     True
349    False
Name: genre, Length: 350, dtype: bool
Validating categorical data

~books["genre"].isin(["Fiction", "Non Fiction"])
0      False
1      False
2      False
3      False
4       True
       ...
345    False
346    False
347    False
348    False
349     True
Name: genre, Length: 350, dtype: bool
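
The negated mask can be used to count the unexpected categories and list them. A minimal sketch, assuming a hypothetical four-row frame in which "Poetry" stands in for an unexpected genre:

```python
import pandas as pd

# Hypothetical stand-in for books; "Poetry" is an invented unexpected value
books = pd.DataFrame({
    "name": ["Book A", "Book B", "Book C", "Book D"],
    "genre": ["Non Fiction", "Fiction", "Poetry", "Fiction"],
})

valid_genres = ["Fiction", "Non Fiction"]

# True where genre is NOT one of the valid categories
is_invalid = ~books["genre"].isin(valid_genres)

# Number of offending rows and the distinct offending values
n_invalid = int(is_invalid.sum())
unexpected = sorted(books.loc[is_invalid, "genre"].unique())

print(n_invalid, unexpected)
```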
Validating categorical data

books[books["genre"].isin(["Fiction", "Non Fiction"])].head()
|   |                          name |              author | rating | year |       genre |
| 0 | 10-Day Green Smoothie Cleanse |            JJ Smith |    4.7 | 2016 | Non Fiction |
| 1 |             11/22/63: A Novel |        Stephen King |    4.6 | 2011 |     Fiction |
| 2 |             12 Rules for Life |  Jordan B. Peterson |    4.7 | 2018 | Non Fiction |
| 3 |        1984 (Signet Classics) |       George Orwell |    4.7 | 2017 |     Fiction |
| 5 |         A Dance with Dragons  | George R. R. Martin |    4.4 | 2011 |     Fiction |
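
Note that filtering keeps the original index labels, which is why the table above jumps from 3 to 5. A minimal sketch of renumbering the rows afterwards, assuming a hypothetical four-row frame:

```python
import pandas as pd

# Hypothetical stand-in for books; "Poetry" is an invented unexpected value
books = pd.DataFrame({
    "name": ["Book A", "Book B", "Book C", "Book D"],
    "genre": ["Non Fiction", "Fiction", "Poetry", "Fiction"],
})

# Keep only rows whose genre is one of the valid categories
valid_books = books[books["genre"].isin(["Fiction", "Non Fiction"])]

# The surviving rows keep their original labels, so there is a gap
kept_labels = valid_books.index.tolist()

# Optionally renumber from zero, discarding the old labels
valid_books = valid_books.reset_index(drop=True)

print(kept_labels)
print(valid_books.index.tolist())
```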
Validating numerical data

|   | rating | year |
| 0 |    4.7 | 2016 |
| 1 |    4.6 | 2011 |
| 2 |    4.7 | 2018 |
| 3 |    4.7 | 2017 |
| 4 |    4.8 | 2019 |
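
One way to produce a numeric-only view like the table above is `select_dtypes("number")`; the minimum and maximum then give a quick range check. A minimal sketch, assuming a hypothetical three-row frame and an assumed valid range of 2009 to 2019:

```python
import pandas as pd

# Hypothetical stand-in for books (illustrative values only)
books = pd.DataFrame({
    "name": ["Book A", "Book B", "Book C"],
    "rating": [4.7, 4.6, 4.8],
    "year": [2016, 2011, 2019],
    "genre": ["Non Fiction", "Fiction", "Fiction"],
})

# Keep only the numeric columns
numeric = books.select_dtypes("number")

# Smallest and largest year in the data
year_min = books["year"].min()
year_max = books["year"].max()

# True where year lies in the assumed valid range (bounds inclusive)
in_range = books["year"].between(2009, 2019)

print(numeric.head())
print(year_min, year_max, in_range.all())
```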
Validating numerical data

sns.boxplot(data=books, x="year")

a boxplot of the publishing years for the books data
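
The points a boxplot draws beyond its whiskers can also be found numerically. A minimal sketch using the common 1.5 × IQR rule (seaborn's default whisker length), assuming a hypothetical `year` column with one invented extreme value:

```python
import pandas as pd

# Hypothetical years; 2030 is an invented extreme value
books = pd.DataFrame({"year": [2010, 2011, 2012, 2013, 2014, 2030]})

# Quartiles and interquartile range
q1 = books["year"].quantile(0.25)
q3 = books["year"].quantile(0.75)
iqr = q3 - q1

# Whisker limits under the 1.5 * IQR rule
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Rows that fall outside the whiskers
outliers = books[(books["year"] < lower) | (books["year"] > upper)]

print(lower, upper)
print(outliers)
```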

Validating numerical data

sns.boxplot(data=books, x="year", y="genre")

a boxplot of the books data, broken down by genre
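
The per-genre comparison can be summarized as numbers alongside the plot. A minimal sketch, assuming a hypothetical six-row frame:

```python
import pandas as pd

# Hypothetical stand-in for books (illustrative values only)
books = pd.DataFrame({
    "year": [2010, 2012, 2014, 2011, 2013, 2019],
    "genre": ["Fiction"] * 3 + ["Non Fiction"] * 3,
})

# Minimum, median, and maximum year within each genre
summary = books.groupby("genre")["year"].agg(["min", "median", "max"])

print(summary)
```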

