Survival Analysis in Python
Shae Wang
Senior Data Scientist
Probability distribution: a mathematical function that describes the probability of different event outcomes.
Weibull distribution: a continuous probability distribution that models time-to-event data very well, although it was originally applied to model particle size distributions.
$$f(x;\lambda,k)=\frac{k}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{k-1}e^{-(x/\lambda)^k}$$ $$x\geq0,k>0,\lambda>0$$
$k$: determines the shape
$\lambda$: determines the scale
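To see how the two parameters act, the density above can be evaluated directly. The following is a minimal sketch using only the standard library; the function name `weibull_pdf` and the parameter values are illustrative assumptions, not part of the slides.

```python
import math

def weibull_pdf(x, lam, k):
    """Weibull density f(x; lam, k); use x > 0 when k < 1."""
    if x < 0 or lam <= 0 or k <= 0:
        raise ValueError("requires x >= 0, lam > 0, k > 0")
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

# Hypothetical values: keep the scale fixed and vary the shape
for k in (0.5, 1.0, 3.0):
    values = [round(weibull_pdf(x, lam=2.0, k=k), 3) for x in (0.5, 1.0, 2.0, 4.0)]
    print(f"k={k}: {values}")
```

With $k=1$ the expression reduces to the exponential density with rate $1/\lambda$, which is a quick sanity check on the formula.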
A company maintains a fleet of machines that are prone to failure...
$$f(x;\lambda,k)=\frac{k}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{k-1}e^{-(x/\lambda)^k} \quad\rightarrow\quad S(t)=e^{-(t/\lambda)^\rho}$$
Here $\rho$ is the shape parameter, the same as $k$ in the density.
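The step from the density to the survival function can be filled in as follows, writing $\rho$ for $k$ and $t$ for the time variable. This is the standard derivation rather than something stated on the slide. Substituting $u=(x/\lambda)^\rho$, so that $du=\frac{\rho}{\lambda}\big(\frac{x}{\lambda}\big)^{\rho-1}dx$, gives the cumulative distribution function:

$$F(t)=\int_0^t \frac{\rho}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{\rho-1}e^{-(x/\lambda)^\rho}\,dx=\int_0^{(t/\lambda)^\rho}e^{-u}\,du=1-e^{-(t/\lambda)^\rho}$$

The survival function is the probability that the event has not yet occurred by time $t$:

$$S(t)=P(T>t)=1-F(t)=e^{-(t/\lambda)^\rho}$$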
$$f(x;\lambda,k)=\frac{k}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^{k-1}e^{-(x/\lambda)^k} \quad\rightarrow\quad f(x;\lambda,k=3)=\frac{3}{\lambda}\bigg(\frac{x}{\lambda}\bigg)^2e^{-(x/\lambda)^3}$$
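As a rough illustration of the $k=3$ case in the machine-failure setting, the sketch below simulates failure times and compares the fraction still running at a given time with $S(t)=e^{-(t/\lambda)^k}$. The scale `lam = 10.0`, the sample size, and the check time are hypothetical choices, not values from the slides. It relies on the standard library's `random.weibullvariate(alpha, beta)`, where `alpha` is the scale and `beta` is the shape.

```python
import math
import random

def weibull_survival(t, lam, rho):
    """S(t) = exp(-(t / lam) ** rho)."""
    return math.exp(-((t / lam) ** rho))

random.seed(0)                      # fixed seed so the run is repeatable
lam, k = 10.0, 3.0                  # hypothetical scale; shape k = 3 as above
n = 20000                           # hypothetical number of machines
lifetimes = [random.weibullvariate(lam, k) for _ in range(n)]

t = 8.0                             # hypothetical time at which to check
empirical = sum(x > t for x in lifetimes) / n   # share still running at t
theoretical = weibull_survival(t, lam, k)
print(f"empirical S({t}) = {empirical:.3f}, theoretical S({t}) = {theoretical:.3f}")
```

The two numbers should agree closely for a sample of this size, which is what makes the Weibull form usable as a parametric model of survival.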
Import the WeibullFitter class:
from lifelines import WeibullFitter
Instantiate the WeibullFitter class:
wb = WeibullFitter()
Call .fit() to fit the estimator to the data:
wb.fit(durations, event_observed)
Access .survival_function_, .lambda_, .rho_, .summary, .predict()
DataFrame name: mortgage_df
id | duration | paid_off |
---|---|---|
1 | 25 | 0 |
2 | 17 | 1 |
3 | 5 | 0 |
... | ... | ... |
1000 | 30 | 1 |
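import matplotlib.pyplot as plt  # needed for plt.show() below
from lifelines import WeibullFitter
# Fit a Weibull model: duration is the observed time, paid_off flags the event
wb = WeibullFitter()
wb.fit(durations=mortgage_df["duration"],
       event_observed=mortgage_df["paid_off"])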
wb.survival_function_.plot()
plt.show()
print(wb.lambda_, wb.rho_)
6.11 0.94
print(wb.predict(20))
0.05
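The prediction can be cross-checked by hand from the fitted parameters. The sketch below assumes, as the 0.05 output suggests, that `wb.predict(20)` is the survival probability at $t=20$, and it uses the rounded values of $\lambda$ and $\rho$ printed above, so only the first couple of decimals are meaningful.

```python
import math

lam, rho = 6.11, 0.94     # rounded parameter values printed above
t = 20

# S(t) = exp(-(t / lambda) ** rho)
survival_at_t = math.exp(-((t / lam) ** rho))
print(round(survival_at_t, 2))
```

Under the fitted model, roughly 5% of mortgages are expected to remain unpaid after 20 time units.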