Analyzing Financial Statements in Python
Rohan Chatterjee
Risk Modeler
JSON ("JavaScript Object Notation") format
Reading JSON files into Python using pandas:
import pandas as pd
cash_flow = pd.read_json("cash_flow_statement.json")
print(cash_flow.head())
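To try the call without the course file, here is a minimal sketch that reads JSON from an in-memory string instead. The records, field names, and figures are hypothetical stand-ins, not the contents of cash_flow_statement.json.

```python
import io

import pandas as pd

# Hypothetical JSON content; field names and numbers are made up for illustration.
json_text = """
[
  {"company": "A", "Year": 2021, "Net Cash Flow from Operating Activities": 120.0},
  {"company": "A", "Year": 2022, "Net Cash Flow from Operating Activities": 150.0},
  {"company": "B", "Year": 2021, "Net Cash Flow from Operating Activities": 80.0}
]
"""

# read_json accepts a path or a file-like object; StringIO wraps the text
cash_flow = pd.read_json(io.StringIO(json_text))
print(cash_flow.head())
```

Each JSON object becomes one row, and each key becomes a column.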

Proportion of cash flow from operating activities to net income
Operating activities are core activities of the business
High ratio implies business generates sizable proportion of cash from operating activities
Formula:
$$\dfrac{\text{Cash flow from operating activities}}{\text{Net income}}$$
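As a rough sketch of the formula above in pandas: the column names "Net Cash Flow from Operating Activities" and "Net Income" and all figures are assumptions chosen for illustration, not taken from the course dataset.

```python
import pandas as pd

# Hypothetical figures; column names are assumed, not from the course data.
statements = pd.DataFrame({
    "company": ["A", "B"],
    "Net Cash Flow from Operating Activities": [150.0, 40.0],
    "Net Income": [100.0, 80.0],
})

# Cash flow from operating activities divided by net income, row by row
statements["cash_flow_to_net_income"] = (
    statements["Net Cash Flow from Operating Activities"] / statements["Net Income"]
)
print(statements[["company", "cash_flow_to_net_income"]])
```

In this made-up data, company A's operating cash flow is 1.5 times its net income, while company B's is only half of it.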
Proportion of cash flow from operating activities to current liabilities
Formula:
$$\dfrac{\text{Cash flow from operating activities}}{\text{Current liabilities}}$$
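A small sketch of this second ratio, again with hypothetical figures. "Total Current Liabilities" is the column name the slides use; the operating cash flow column name is an assumption. One liability entry is deliberately left missing to show that the ratio is then missing too.

```python
import numpy as np
import pandas as pd

# Hypothetical figures; one liability value is missing on purpose.
balance = pd.DataFrame({
    "company": ["A", "A", "B"],
    "Net Cash Flow from Operating Activities": [150.0, 120.0, 40.0],
    "Total Current Liabilities": [300.0, np.nan, 160.0],
})

# Cash flow from operating activities divided by current liabilities
balance["operating_cash_flow_ratio"] = (
    balance["Net Cash Flow from Operating Activities"]
    / balance["Total Current Liabilities"]
)
print(balance[["company", "operating_cash_flow_ratio"]])
```

Dividing by NaN yields NaN, so a missing denominator leaves the ratio undefined for that row.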
In the dataset shown, some entries of "Total Current Liabilities" are missing, indicated by NaN.
Imputing the missing values with the mean by company:
imputation = dataset.groupby("company")["Total Current Liabilities"].transform("mean")
dataset["Imputed Total Current Liabilities"] = dataset["Total Current Liabilities"].fillna(imputation)
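After imputation, the "Imputed Total Current Liabilities" column holds the company mean wherever the original value was NaN.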
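To make the effect concrete, here is a minimal sketch that applies the same two lines to a small hypothetical dataset (the companies and figures are made up).

```python
import numpy as np
import pandas as pd

# Hypothetical data: company A has one missing liability entry.
dataset = pd.DataFrame({
    "company": ["A", "A", "A", "B", "B"],
    "Total Current Liabilities": [300.0, np.nan, 500.0, 160.0, 200.0],
})

# Per-company mean, broadcast back to every row (NaN is skipped by the mean)
imputation = dataset.groupby("company")["Total Current Liabilities"].transform("mean")

# Fill only the missing entries; observed values are left unchanged
dataset["Imputed Total Current Liabilities"] = (
    dataset["Total Current Liabilities"].fillna(imputation)
)
print(dataset)
```

The missing entry for company A is filled with 400.0, the mean of its two observed values, and the observed rows keep their original figures.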
Imputing a missing value with the 70th percentile of its non-missing values, taken in the worse direction (for liabilities, the higher end), gives a more conservative imputation for it.
Computing ratios with the more conservative imputation might be more prudent if the ratio is to be used in decision-making.
Imputing with the 70th percentile by company:
import numpy as np
imputation = dataset.groupby("company")["Total Current Liabilities"].transform(
    lambda x: np.nanquantile(x, 0.7)  # 70th percentile, ignoring NaN
)
dataset["Imputed Total Current Liabilities"] = dataset["Total Current Liabilities"].fillna(imputation)
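The difference between the two imputations can be sketched on hypothetical data. The figures and the two output column names below are made up for illustration; the percentile uses NumPy's default linear interpolation.

```python
import numpy as np
import pandas as pd

# Hypothetical data: company A has three observed values and one missing.
dataset = pd.DataFrame({
    "company": ["A", "A", "A", "A", "B", "B"],
    "Total Current Liabilities": [100.0, 200.0, 300.0, np.nan, 160.0, 200.0],
})
grouped = dataset.groupby("company")["Total Current Liabilities"]

# Mean imputation versus 70th-percentile imputation, both per company
mean_fill = grouped.transform("mean")
p70_fill = grouped.transform(lambda x: np.nanquantile(x, 0.7))

dataset["Imputed (mean)"] = dataset["Total Current Liabilities"].fillna(mean_fill)
dataset["Imputed (70th percentile)"] = dataset["Total Current Liabilities"].fillna(p70_fill)
print(dataset.round(2))
```

For company A in this example, the mean fill is 200 while the 70th-percentile fill is about 240. A larger liability figure lowers the cash flow to current liabilities ratio, which is the more cautious reading.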