Exploratory Data Analysis in Python
Izzy Weber
Curriculum Manager, DataCamp
import pandas as pd

divorce = pd.read_csv("divorce.csv")
divorce.head()
  marriage_date  marriage_duration
0    2000-06-26                5.0
1    2000-02-02                2.0
2    1991-10-09               10.0
3    1993-01-02               10.0
4    1998-12-11                7.0
divorce.dtypes
marriage_date object
marriage_duration float64
dtype: object
divorce = pd.read_csv("divorce.csv", parse_dates=["marriage_date"])
divorce.dtypes
marriage_date datetime64[ns]
marriage_duration float64
dtype: object
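A minimal sketch of the difference parse_dates makes. Since divorce.csv itself is not available here, the example reads a small in-memory stand-in (the csv_text string is an assumption built from the first rows shown above).

```python
import io

import pandas as pd

# Hypothetical stand-in for divorce.csv, built from the rows shown earlier
csv_text = """marriage_date,marriage_duration
2000-06-26,5.0
2000-02-02,2.0
1991-10-09,10.0
"""

# Without parse_dates, the date column is read as plain text
as_text = pd.read_csv(io.StringIO(csv_text))

# With parse_dates, the named column is converted while the file is read
as_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=["marriage_date"])

print(as_text.dtypes)
print(as_dates.dtypes)
```

Only the second DataFrame holds a datetime column, so only it supports date arithmetic and the dt accessor.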
pd.to_datetime() converts arguments to DateTime data
divorce["marriage_date"] = pd.to_datetime(divorce["marriage_date"])
divorce.dtypes
marriage_date datetime64[ns]
marriage_duration float64
dtype: object
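A hedged sketch of the same conversion on an already-loaded column. The format and errors arguments are optional extras not used in the slides, and the "not a date" row is invented for illustration: with errors="coerce", values that cannot be parsed become NaT instead of raising an error.

```python
import pandas as pd

# Hypothetical data: the third date string is deliberately invalid
divorce = pd.DataFrame(
    {
        "marriage_date": ["2000-06-26", "2000-02-02", "not a date"],
        "marriage_duration": [5.0, 2.0, 10.0],
    }
)

# Convert the text column; unparseable entries are set to NaT
divorce["marriage_date"] = pd.to_datetime(
    divorce["marriage_date"], format="%Y-%m-%d", errors="coerce"
)

print(divorce.dtypes)
print(divorce["marriage_date"].isna().sum())  # number of dates that failed to parse
```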
divorce.head(2)
   month  day  year  marriage_duration
0      6   26  2000                5.0
1      2    2  2000                2.0
divorce["marriage_date"] = pd.to_datetime(divorce[["month", "day", "year"]])
divorce.head(2)
   month  day  year  marriage_duration marriage_date
0      6   26  2000                5.0    2000-06-26
1      2    2  2000                2.0    2000-02-02
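A self-contained sketch of assembling one date from separate columns. The DataFrame name parts is hypothetical; its rows are the two shown above. pd.to_datetime recognizes the columns by name, so they need to be called year, month, and day (their order does not matter).

```python
import pandas as pd

# The two rows from the slide, with the date split across three columns
parts = pd.DataFrame(
    {
        "month": [6, 2],
        "day": [26, 2],
        "year": [2000, 2000],
        "marriage_duration": [5.0, 2.0],
    }
)

# Passing a DataFrame of year/month/day columns builds one datetime per row
parts["marriage_date"] = pd.to_datetime(parts[["month", "day", "year"]])

print(parts)
```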
dt.month, dt.day, and dt.year attributes
divorce["marriage_month"] = divorce["marriage_date"].dt.month
divorce.head()
  marriage_date  marriage_duration  marriage_month
0    2000-06-26                5.0               6
1    2000-02-02                2.0               2
2    1991-10-09               10.0              10
3    1993-01-02               10.0               1
4    1998-12-11                7.0              12
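A short sketch showing all three attributes side by side, using the first three rows from the slides. The marriage_day and marriage_year column names are illustrative additions, not columns in the original data.

```python
import pandas as pd

divorce = pd.DataFrame(
    {
        "marriage_date": pd.to_datetime(["2000-06-26", "2000-02-02", "1991-10-09"]),
        "marriage_duration": [5.0, 2.0, 10.0],
    }
)

# The .dt accessor exposes date components of a datetime column as integers
divorce["marriage_month"] = divorce["marriage_date"].dt.month
divorce["marriage_day"] = divorce["marriage_date"].dt.day
divorce["marriage_year"] = divorce["marriage_date"].dt.year

print(divorce)
```

The .dt accessor only works on datetime columns, so the conversion step has to come first.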
import matplotlib.pyplot as plt
import seaborn as sns

sns.lineplot(data=divorce, x="marriage_month", y="marriage_duration")
plt.show()
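By default, sns.lineplot aggregates rows that share an x value and draws their mean, with a confidence band around it. A rough pandas-only sketch of the numbers behind that line, assuming the five rows shown earlier (the monthly_mean name is hypothetical):

```python
import pandas as pd

divorce = pd.DataFrame(
    {
        "marriage_date": pd.to_datetime(
            ["2000-06-26", "2000-02-02", "1991-10-09", "1993-01-02", "1998-12-11"]
        ),
        "marriage_duration": [5.0, 2.0, 10.0, 10.0, 7.0],
    }
)
divorce["marriage_month"] = divorce["marriage_date"].dt.month

# Mean marriage duration per marriage month, sorted by month
monthly_mean = divorce.groupby("marriage_month")["marriage_duration"].mean()

print(monthly_mean)
```

With only one row per month in this sample, each mean is simply that row's value; on the full dataset the means summarize many marriages per month.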