Survival Analysis in Python
Shae Wang
Senior Data Scientist
$$\Large{S(t) = Pr(T>t)}$$
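To make the definition concrete, here is a minimal sketch of the survival function estimated as a plain fraction. It assumes every duration is fully observed (no censoring), which is a simplification. The helper name `empirical_survival` and the sample durations are hypothetical and chosen only for illustration.

```python
def empirical_survival(durations, t):
    """Fraction of subjects whose duration T exceeds t (assumes no censoring)."""
    if not durations:
        raise ValueError("durations must be non-empty")
    return sum(1 for d in durations if d > t) / len(durations)


# Hypothetical, fully observed durations in years (illustrative only).
durations = [2, 5, 5, 8, 12]

s_at_0 = empirical_survival(durations, 0)    # every duration exceeds 0
s_at_5 = empirical_survival(durations, 5)    # only 8 and 12 exceed 5
s_at_12 = empirical_survival(durations, 12)  # no duration exceeds 12

print(s_at_0, s_at_5, s_at_12)
```

When some durations are censored, this simple fraction no longer applies, which is why an estimator such as Kaplan-Meier is used instead.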
The lifelines package is a complete survival analysis library.
import lifelines
import matplotlib.pyplot as plt
.fit(durations, event_observed)
.plot_survival_function()
DataFrame name: mortgage_df
id | duration | paid_off |
---|---|---|
1 | 25 | 0 |
2 | 17 | 1 |
3 | 5 | 0 |
... | ... | ... |
100 | 30 | 1 |
id: the id of a mortgage loan
duration: the number of years the mortgage is not paid off
paid_off: 1 if the mortgage is fully paid off, 0 if not fully paid off
import lifelines
from matplotlib import pyplot as plt
kmf = lifelines.KaplanMeierFitter()
kmf.fit(durations=mortgage_df["duration"],
        event_observed=mortgage_df["paid_off"])
kmf.plot_survival_function()
plt.show()
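As a rough illustration of what the fitted curve represents, the sketch below computes the Kaplan-Meier (product-limit) estimate by hand for the four rows visible in the mortgage_df table above. It is not how lifelines is implemented. The function name `kaplan_meier` is hypothetical, and using only the four visible rows is an assumption made to keep the arithmetic checkable.

```python
from collections import Counter


def kaplan_meier(durations, event_observed):
    """Return {event_time: survival estimate} using the product-limit formula."""
    # Count observed events (paid_off == 1) at each distinct time.
    events = Counter(d for d, e in zip(durations, event_observed) if e)
    survival = 1.0
    curve = {}
    for t in sorted(events):
        # Subjects still at risk just before t, including censored ones.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1 - events[t] / at_risk
        curve[t] = survival
    return curve


# The four rows shown in the table: ids 1, 2, 3 and 100.
durations = [25, 17, 5, 30]
paid_off = [0, 1, 0, 1]

curve = kaplan_meier(durations, paid_off)
for t, s in curve.items():
    print(f"S({t}) = {s:.3f}")
```

Here the event is the mortgage being paid off, so S(t) is the estimated probability that a mortgage is still unpaid after t years. Censored rows (paid_off equal to 0) never trigger a drop in the curve, but they still count in the at-risk denominator until their own duration has passed.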