Analyzing Financial Statements in Python
Rohan Chatterjee
Risk Modeler
JSON ("JavaScript Object Notation") format
Read JSON files into Python using pandas:
import pandas as pd
cash_flow = pd.read_json("cash_flow_statement.json")
print(cash_flow.head())
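For readers without the file at hand, the same loading step can be sketched on an in-memory JSON string. The records and the column names "Year" and "Net Cash Flow from Operating Activities" are made up for illustration; the real cash_flow_statement.json may be laid out differently.

```python
import io
import pandas as pd

# Hypothetical records standing in for cash_flow_statement.json.
raw_json = """
[
  {"company": "AAA", "Year": 2021, "Net Cash Flow from Operating Activities": 120.0},
  {"company": "BBB", "Year": 2021, "Net Cash Flow from Operating Activities": 45.0}
]
"""

# read_json accepts a path or a file-like object; StringIO wraps the string.
cash_flow = pd.read_json(io.StringIO(raw_json))
print(cash_flow.head())
```

Each JSON object becomes one row, and its keys become the column names.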
Proportion of cash flow from operating activities to net income
Operating activities are core activities of the business
High ratio implies business generates sizable proportion of cash from operating activities
Formula:
$$\dfrac{\text{Cash flow from operating activities}}{\text{Net income}}$$
Proportion of cash flow from operating activities to current liabilities
Formula:
$$\dfrac{\text{Cash flow from operating activities}}{\text{Current liabilities}}$$
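Both ratios above reduce to a column division in pandas. The sketch below uses made-up figures; the column names "Net Cash Flow from Operating Activities" and "Net Income", and the two new ratio column names, are assumptions rather than names taken from the course data.

```python
import pandas as pd

# Toy figures for illustration only.
ratios = pd.DataFrame({
    "company": ["AAA", "BBB"],
    "Net Cash Flow from Operating Activities": [120.0, 45.0],
    "Net Income": [100.0, 90.0],
    "Total Current Liabilities": [80.0, 60.0],
})

operating_cf = ratios["Net Cash Flow from Operating Activities"]

# Cash flow from operating activities / net income
ratios["cash_flow_to_net_income"] = operating_cf / ratios["Net Income"]

# Cash flow from operating activities / current liabilities
ratios["operating_cash_flow_ratio"] = operating_cf / ratios["Total Current Liabilities"]

print(ratios[["company", "cash_flow_to_net_income", "operating_cash_flow_ratio"]])
```

Because pandas divides element-wise, a NaN in either the numerator or the denominator yields a NaN ratio for that row.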
In the dataset shown, some entries of "Total Current Liabilities" are missing, indicated by NaN.
Fill them with each company's mean:
imputation = dataset.groupby("company")["Total Current Liabilities"].transform("mean")
dataset["Imputed Total Current Liabilities"] = dataset["Total Current Liabilities"].fillna(imputation)
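As a small self-contained check of the mean imputation above, the same two lines can be run on a toy frame. The companies and values here are invented; the variable name toy is used so that it does not clash with dataset.

```python
import numpy as np
import pandas as pd

# Invented data: one missing value per company.
toy = pd.DataFrame({
    "company": ["A", "A", "A", "B", "B", "B"],
    "Total Current Liabilities": [10.0, np.nan, 30.0, 100.0, 200.0, np.nan],
})

# transform("mean") returns a Series aligned with toy, holding each row's
# company mean (NaN values are skipped when the mean is computed).
imputation = toy.groupby("company")["Total Current Liabilities"].transform("mean")

# fillna only touches the NaN positions; observed values are kept as-is.
toy["Imputed Total Current Liabilities"] = toy["Total Current Liabilities"].fillna(imputation)
print(toy)
```

Company A's gap is filled with 20.0 (the mean of 10 and 30) and company B's with 150.0 (the mean of 100 and 200).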
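After mean imputation, dataset contains an "Imputed Total Current Liabilities" column in which each NaN is replaced by that company's mean.
Imputing a missing value with the 70th percentile of the non-missing values instead gives a more conservative imputation: it assumes higher current liabilities, which lowers the ratio.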
Computing ratios with the more conservative imputation might be more prudent if the ratio is to be used in decision-making.
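Impute with the 70th percentile within each company:
import numpy as np
# np.nanquantile ignores NaN values when computing each company's 70th percentile
imputation = dataset.groupby("company")["Total Current Liabilities"].transform(
    lambda x: np.nanquantile(x, 0.7)
)
dataset["Imputed Total Current Liabilities"] = dataset["Total Current Liabilities"].fillna(imputation)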
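To see how the two imputations differ, the sketch below applies both to the same invented data. The names toy, mean_fill and q70_fill are hypothetical. In this toy data the 70th percentile is above the mean for both companies, but that need not hold for every distribution.

```python
import numpy as np
import pandas as pd

# Invented data: one missing value per company.
toy = pd.DataFrame({
    "company": ["A", "A", "A", "A", "B", "B", "B"],
    "Total Current Liabilities": [10.0, 20.0, 30.0, np.nan, 100.0, 200.0, np.nan],
})

grouped = toy.groupby("company")["Total Current Liabilities"]

# Per-company fill values, broadcast to every row of the company.
mean_fill = grouped.transform("mean")
q70_fill = grouped.transform(lambda x: np.nanquantile(x, 0.7))

toy["Imputed (mean)"] = toy["Total Current Liabilities"].fillna(mean_fill)
toy["Imputed (70th percentile)"] = toy["Total Current Liabilities"].fillna(q70_fill)

# Show only the rows that were originally missing.
print(toy[toy["Total Current Liabilities"].isna()])
```

With NumPy's default linear interpolation, company A's gap is filled with about 24 rather than 20, and company B's with about 170 rather than 150. A larger denominator then gives a lower, more cautious cash flow to current liabilities ratio.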