Goal: Find The Best Fit Line
We want to model a linear relationship between variables:
$$\hat{y}_i = m x_i + b$$
Where:
$\hat{y}_i$ is the predicted value,
$x_i$ is the observed input (independent variable),
$y_i$ is the actual output (dependent variable),
$m$ is the slope and $b$ is the intercept of the line.
Objective: Minimize the Total Squared Error
The error (residual) for each point is:
$$e_i = y_i - \hat{y}_i = y_i - (m x_i + b)$$
We want to minimize the sum of squared errors:
$$E = \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2$$
Step 1: Minimize Error Function (E)
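We treat E as a function of m and b:
$$E(m, b) = \sum_{i=1}^{n} (y_i - m x_i - b)^2$$
To minimize E, take the partial derivatives of E with respect to m and b and set them to zero. Because E is a sum of squares, it is convex in m and b, so a point where both derivatives vanish is a minimum.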
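To make E(m, b) concrete, here is a minimal Python sketch that evaluates it for a candidate line. The function name `sum_squared_error` and the sample data are illustrative assumptions, not part of the derivation.

```python
def sum_squared_error(xs, ys, m, b):
    """Return E(m, b): the sum over all points of (y_i - (m*x_i + b))**2."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Illustrative data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

print(sum_squared_error(xs, ys, 2.0, 1.0))  # 0.0: the true line has no error
print(sum_squared_error(xs, ys, 1.0, 1.0))  # 30.0: residuals are 1, 2, 3, 4
```

A worse choice of m and b gives a larger E, which is why the rest of the derivation looks for the pair that makes E as small as possible.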
Step 2: Partial Derivative with Respect to m
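$$\frac{\partial E}{\partial m} = \frac{\partial}{\partial m} \sum_{i=1}^{n} (y_i - m x_i - b)^2$$
Use the chain rule:
$$\frac{\partial E}{\partial m} = \sum_{i=1}^{n} 2\,(-x_i)(y_i - m x_i - b) = -2 \sum_{i=1}^{n} x_i (y_i - m x_i - b)$$
Set this derivative to 0:
$$-2 \sum_{i=1}^{n} x_i (y_i - m x_i - b) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i (y_i - m x_i - b) = 0 \tag{1}$$
Step 3: Partial Derivative with Respect to b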
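$$\frac{\partial E}{\partial b} = \frac{\partial}{\partial b} \sum_{i=1}^{n} (y_i - m x_i - b)^2$$
Use the chain rule:
$$\frac{\partial E}{\partial b} = \sum_{i=1}^{n} 2\,(y_i - m x_i - b)(-1) = -2 \sum_{i=1}^{n} (y_i - m x_i - b)$$
Set this to zero:
$$\sum_{i=1}^{n} (y_i - m x_i - b) = 0 \tag{2}$$
Step 4: Solve the System of Equations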
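Before solving, the two partial derivatives from Steps 2 and 3 can be sanity-checked numerically. The sketch below compares the analytic expressions with central finite differences of E. The helper names (`sse`, `grad_sse`, `numeric_grad`), the step size `h`, and the sample data are assumptions made for illustration.

```python
def sse(xs, ys, m, b):
    """E(m, b) = sum of squared residuals."""
    return sum((y - m * x - b) ** 2 for x, y in zip(xs, ys))


def grad_sse(xs, ys, m, b):
    """Analytic partial derivatives of E, as derived in Steps 2 and 3."""
    dE_dm = -2 * sum(x * (y - m * x - b) for x, y in zip(xs, ys))
    dE_db = -2 * sum(y - m * x - b for x, y in zip(xs, ys))
    return dE_dm, dE_db


def numeric_grad(xs, ys, m, b, h=1e-6):
    """Central finite differences, used only as an independent check."""
    dE_dm = (sse(xs, ys, m + h, b) - sse(xs, ys, m - h, b)) / (2 * h)
    dE_db = (sse(xs, ys, m, b + h) - sse(xs, ys, m, b - h)) / (2 * h)
    return dE_dm, dE_db


# Illustrative data and an arbitrary (non-optimal) candidate line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]

analytic = grad_sse(xs, ys, 0.5, 1.0)
numeric = numeric_grad(xs, ys, 0.5, 1.0)
print(analytic)  # (-28.0, -10.0)
print(numeric)   # should agree with the analytic values up to rounding
```

Both derivatives are negative at this candidate line, meaning E would decrease if m and b were increased, so this line is not yet the best fit.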
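Equation (2), solved for b (from here on, $\sum$ denotes the sum over $i = 1, \dots, n$):
$$\sum_{i=1}^{n} (y_i - m x_i - b) = 0 \;\Rightarrow\; \sum y_i - m \sum x_i - n b = 0 \;\Rightarrow\; b = \frac{\sum y_i - m \sum x_i}{n} \tag{3}$$
Expanding the product, equation (1) becomes:
$$\sum x_i y_i - m \sum x_i^2 - b \sum x_i = 0$$
Substitute b from equation (3):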
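$$\sum x_i y_i - m \sum x_i^2 - \left( \frac{\sum y_i - m \sum x_i}{n} \right) \sum x_i = 0$$
Expand the last term:
$$\sum x_i y_i - m \sum x_i^2 - \frac{\sum x_i \sum y_i}{n} + \frac{m \left( \sum x_i \right)^2}{n} = 0$$
Now collect the terms with m on the left-hand side:
$$m \left( \frac{\left( \sum x_i \right)^2}{n} - \sum x_i^2 \right) = \frac{\left( \sum x_i \right) \left( \sum y_i \right)}{n} - \sum x_i y_i$$
Multiply both sides by $-n$ to clear the fractions, then divide by the coefficient of m:
$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$
Slope: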
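$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$
Intercept:
$$b = \frac{\sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i}{n}$$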
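The slope and intercept formulas translate directly into code. The following is a minimal Python sketch, not a production implementation: the function name `fit_line`, the input checks, and the sample data are assumptions added for illustration. The check on the denominator covers the case where all x values are equal, in which the slope formula is undefined.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line through the points (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # The four sums that appear in the closed-form solution.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom  # slope
    b = (sum_y - m * sum_x) / n               # intercept
    return m, b


# Illustrative data lying exactly on y = 2x + 1.
m, b = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(m, b)  # 2.0 1.0
```

For this data the sums are n = 4, sum of x = 10, sum of y = 24, sum of xy = 70, and sum of x squared = 30, which gives m = (280 - 240) / (120 - 100) = 2 and b = (24 - 20) / 4 = 1.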