Intermediate Regression with statsmodels in Python
Maarten Van den Broeck
Content Developer at DataCamp
A line plot of a quadratic equation
import numpy as np
import pandas as pd
import seaborn as sns

x = np.arange(-4, 5, 0.1)
y = x ** 2 - x + 10
xy_data = pd.DataFrame({"x": x,
                        "y": y})
sns.lineplot(x="x",
             y="y",
             data=xy_data)
$y = x ^ 2 - x + 10$
$\frac{\partial y}{\partial x} = 2 x - 1$
$0 = 2 x - 1$
$x = 0.5$
$y = 0.5 ^ 2 - 0.5 + 10 = 9.75$
Don't worry if this doesn't make sense; you won't need it for the exercises.
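The calculus above can also be checked numerically. The following is a minimal sketch, assuming NumPy's `poly1d` helper is an acceptable stand-in for doing the derivative by hand; the names `quadratic`, `x_min`, and `y_min` are illustrative and not part of the course material.

```python
import numpy as np

# Coefficients of y = x^2 - x + 10, highest power first
quadratic = np.poly1d([1, -1, 10])

# Derivative polynomial: 2x - 1
derivative = quadratic.deriv()

# The derivative is zero at the minimum, so take its root
x_min = derivative.roots[0]

# Evaluate the original quadratic at that point
y_min = quadratic(x_min)

print(x_min, y_min)
```

Both values should agree with the hand calculation: the minimum sits at x = 0.5, where y = 9.75.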
from scipy.optimize import minimize

def calc_quadratic(x):
    y = x ** 2 - x + 10
    return y

minimize(fun=calc_quadratic,
         x0=3)
fun: 9.75
hess_inv: array([[0.5]])
jac: array([0.])
message: 'Optimization terminated successfully.'
nfev: 6
nit: 2
njev: 3
status: 0
success: True
x: array([0.49999998])
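The printed result is an object whose fields can be read as attributes. The sketch below shows one way to pull out the pieces of interest; the variable names `result`, `x_min`, and `y_min` are illustrative assumptions, not part of the slides.

```python
from scipy.optimize import minimize

def calc_quadratic(x):
    return x ** 2 - x + 10

# Store the result instead of just printing it
result = minimize(fun=calc_quadratic, x0=3)

# result.x is an array, even when there is only one input variable
x_min = result.x[0]

# result.fun is the function value at the minimum
y_min = result.fun

# result.success reports whether the optimizer converged
print(result.success, x_min, y_min)
```

Because the optimizer works numerically, `x_min` is only approximately 0.5, which is why the output above shows 0.49999998 rather than the exact value.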
Define a function to calculate the sum of squares metric.
Call minimize() to find coefficients that minimize this function.
def calc_sum_of_squares(coeffs):
    intercept, slope = coeffs
    # More calculation!

minimize(
    fun=calc_sum_of_squares,
    # One starting value per coefficient
    x0=[0, 0]
)
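To make the outline concrete, here is one possible way to fill in the calculation. It is a sketch under stated assumptions: the arrays `x_actual` and `y_actual` are made-up data lying exactly on the line y = 2 + 3x, and the body of the function is one reasonable implementation of the sum of squares, not necessarily the one used in the exercises.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data for illustration only: points on y = 2 + 3x
x_actual = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_actual = 2 + 3 * x_actual

def calc_sum_of_squares(coeffs):
    # Unpack the two coefficients being optimized
    intercept, slope = coeffs
    # Predictions from the straight-line model
    y_pred = intercept + slope * x_actual
    # Differences between predicted and actual responses
    y_diff = y_pred - y_actual
    # Sum of squared differences: the metric to minimize
    return np.sum(y_diff ** 2)

# x0 needs one starting value per coefficient
result = minimize(fun=calc_sum_of_squares, x0=[0, 0])

intercept_hat, slope_hat = result.x
print(intercept_hat, slope_hat)
```

Since the made-up data has no noise, the recovered coefficients should be very close to 2 and 3.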