Survival Analysis in Python
Shae Wang
Senior Data Scientist
Probability distribution: a mathematical function that describes the probability of different event outcomes.


Weibull distribution: a continuous probability distribution that models time-to-event data very well (though it was originally applied to model particle size distributions).
$$f(x;\lambda,k)=\frac{k}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{k-1}e^{-(x/\lambda)^k}$$ $$x\geq0,k>0,\lambda>0$$
$k$ determines the shape

$\lambda$ determines the scale
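To make the density formula concrete, here is a minimal sketch that evaluates it directly with the standard library. The function name `weibull_pdf` and the parameter values are illustrative assumptions, not part of the slides.

```python
import math

def weibull_pdf(x: float, lam: float, k: float) -> float:
    """Weibull density f(x; lam, k) for x >= 0, with k > 0 (shape) and lam > 0 (scale)."""
    if lam <= 0 or k <= 0:
        raise ValueError("lam and k must be positive")
    if x < 0:
        return 0.0
    if x == 0:
        # Limit of the density at zero depends on the shape parameter.
        if k < 1:
            return math.inf
        return 1.0 / lam if k == 1 else 0.0
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

# Same scale, different shapes (hypothetical values chosen for illustration).
for k in (0.5, 1.0, 3.0):
    print(k, weibull_pdf(1.0, lam=1.0, k=k))
```

With `k = 1` the expression reduces to the exponential density, which is one quick way to sanity-check an implementation.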

A company maintains a fleet of machines that are prone to failure...


$$f(x;\lambda,k)=\frac{k}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{k-1}e^{-(x/\lambda)^k} \quad\rightarrow\quad S(t)=e^{-(t/\lambda)^\rho}$$

$\rho$ is the same as $k$.
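The link between the density and the survival function can be checked numerically: the density should equal the negative slope of $S(t)$. The sketch below assumes hypothetical parameter values and helper names (`weibull_survival`, `weibull_pdf`) chosen for illustration.

```python
import math

def weibull_survival(t: float, lam: float, rho: float) -> float:
    """S(t) = exp(-(t / lam) ** rho) for t >= 0."""
    return math.exp(-((t / lam) ** rho))

def weibull_pdf(t: float, lam: float, rho: float) -> float:
    """Weibull density for t > 0, written with rho in place of k."""
    return (rho / lam) * (t / lam) ** (rho - 1) * math.exp(-((t / lam) ** rho))

# Hypothetical parameters, not taken from the slides.
lam, rho, t, h = 2.0, 1.5, 1.0, 1e-6

# Central finite difference approximating dS/dt at t.
slope = (weibull_survival(t + h, lam, rho) - weibull_survival(t - h, lam, rho)) / (2 * h)

print(-slope, weibull_pdf(t, lam, rho))  # the two values should agree closely
```

A useful side fact that follows from the formula: at $t=\lambda$ the survival probability is $e^{-1}$ regardless of $\rho$.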
$$f(x;\lambda,k)=\frac{k}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{k-1}e^{-(x/\lambda)^k} \quad\rightarrow\quad f(x;\lambda,k=3)=\frac{3}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^2e^{-(x/\lambda)^3}$$



Import the WeibullFitter class:
from lifelines import WeibullFitter

Instantiate the WeibullFitter class:
wb = WeibullFitter()

Call .fit() to fit the estimator to the data:
wb.fit(durations, event_observed)

Access .survival_function_, .lambda_, .rho_, .summary, .predict()

DataFrame name: mortgage_df

| id | duration | paid_off |
|---|---|---|
| 1 | 25 | 0 |
| 2 | 17 | 1 |
| 3 | 5 | 0 |
| ... | ... | ... |
| 1000 | 30 | 1 |
import matplotlib.pyplot as plt
from lifelines import WeibullFitter

wb = WeibullFitter()
wb.fit(durations=mortgage_df["duration"],
       event_observed=mortgage_df["paid_off"])
wb.survival_function_.plot()
plt.show()

print(wb.lambda_, wb.rho_)
6.11 0.94
print(wb.predict(20))
0.05
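As a cross-check, the predicted value can be reproduced by hand from $S(t)=e^{-(t/\lambda)^\rho}$. This sketch assumes that `wb.predict(20)` returns the fitted survival probability at time 20, and it uses the rounded parameters printed above, so it only matches to the displayed precision.

```python
import math

# Rounded parameter estimates as printed on the slide.
lam, rho = 6.11, 0.94

# Survival probability at t = 20 from the Weibull survival function.
s20 = math.exp(-((20 / lam) ** rho))

print(round(s20, 2))  # 0.05
```

Under this reading, roughly 5% of mortgages are expected to remain unpaid beyond 20 time units.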