Proof of Linear Regression Formulas

Nerd Cafe

Goal: Find the Best Fit Line

We want to model a linear relationship between variables:

$$\hat{y}_i = m x_i + b$$

Where:

  • $\hat{y}_i$ is the predicted value,
  • $x_i$ is the observed input (independent variable),
  • $y_i$ is the actual output (dependent variable),
  • $m$ is the slope,
  • $b$ is the intercept.

Objective: Minimize the Total Squared Error

The error (residual) for each point is:

$$e_i = y_i - \hat{y}_i = y_i - (m x_i + b)$$

We want to minimize the sum of squared errors:

$$E = \sum_{i=1}^{n} \left( y_i - (m x_i + b) \right)^2$$
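
To make the objective concrete, here is a minimal Python sketch that computes the residuals and the sum of squared errors for one candidate line; the data values and the candidate $m$, $b$ are made up for illustration:

```python
import numpy as np

# Small made-up dataset (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# An arbitrary candidate line y_hat = m*x + b
m, b = 2.0, 0.1

y_hat = m * x + b             # predicted values
residuals = y - y_hat         # e_i = y_i - (m*x_i + b)
sse = np.sum(residuals**2)    # E = total squared error

print("residuals:", residuals)
print("SSE:", sse)
```

Least squares asks for the $m$ and $b$ that make this SSE as small as possible.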

Step 1: Minimize Error Function (E)

We treat $E$ as a function of $m$ and $b$:

$$E(m, b) = \sum_{i=1}^{n} (y_i - m x_i - b)^2$$

To minimize $E$, take the partial derivatives of $E$ with respect to $m$ and $b$, and set them to zero.

Step 2: Partial Derivative with Respect to m

$$\frac{\partial E}{\partial m} = \frac{\partial}{\partial m} \sum_{i=1}^{n} (y_i - m x_i - b)^2$$

Use the chain rule:

$$= \sum_{i=1}^{n} 2(-x_i)(y_i - m x_i - b) = -2 \sum_{i=1}^{n} x_i (y_i - m x_i - b)$$

Set this derivative to 0:

$$-2 \sum_{i=1}^{n} x_i (y_i - m x_i - b) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i (y_i - m x_i - b) = 0 \qquad (1)$$

Step 3: Partial Derivative with Respect to 𝑏

$$\frac{\partial E}{\partial b} = \frac{\partial}{\partial b} \sum_{i=1}^{n} (y_i - m x_i - b)^2$$

Use the chain rule:

$$\sum_{i=1}^{n} 2(y_i - m x_i - b)(-1) = -2 \sum_{i=1}^{n} (y_i - m x_i - b)$$

Set this to zero:

$$\sum_{i=1}^{n} (y_i - m x_i - b) = 0 \qquad (2)$$
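
With both partial derivatives in hand, they can be sanity-checked symbolically using SymPy (assuming it is installed); the sketch below fixes $n = 3$ to keep the symbolic sums concrete:

```python
import sympy as sp

m, b = sp.symbols('m b')
xs = sp.symbols('x1:4')  # (x1, x2, x3)
ys = sp.symbols('y1:4')  # (y1, y2, y3)

# E(m, b) = sum_i (y_i - m*x_i - b)^2
E = sum((yi - m * xi - b)**2 for xi, yi in zip(xs, ys))

# Derivatives computed by SymPy
dE_dm = sp.diff(E, m)
dE_db = sp.diff(E, b)

# Closed forms derived in Steps 2 and 3
expected_dm = -2 * sum(xi * (yi - m * xi - b) for xi, yi in zip(xs, ys))
expected_db = -2 * sum(yi - m * xi - b for xi, yi in zip(xs, ys))

# Both differences simplify to zero, confirming the derivation
print(sp.simplify(dE_dm - expected_dm))  # 0
print(sp.simplify(dE_db - expected_db))  # 0
```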

Step 4: Solve the System of Equations

Start from equation (2) and solve for $b$:

$$\sum_{i=1}^{n} (y_i - m x_i - b) = 0 \;\Rightarrow\; \sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i - nb = 0 \;\Rightarrow\; b = \frac{\sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i}{n} \qquad (3)$$

Next, expand equation (1):

$$\sum x_i y_i - m \sum x_i^2 - b \sum x_i = 0$$

Substitute $b$ from equation (3):

$$\sum x_i y_i - m \sum x_i^2 - \left( \frac{\sum y_i - m \sum x_i}{n} \right) \sum x_i = 0$$

Multiply out the last term:

$$\sum x_i y_i - m \sum x_i^2 - \frac{\sum x_i \sum y_i}{n} + m \frac{\left( \sum x_i \right)^2}{n} = 0$$

Now collect the terms containing $m$ and simplify:

$$m \left( \frac{\left( \sum x_i \right)^2}{n} - \sum x_i^2 \right) = \frac{\left( \sum x_i \right) \left( \sum y_i \right)}{n} - \sum x_i y_i$$

Multiply both sides by $-n$ and divide by $n \sum x_i^2 - \left( \sum x_i \right)^2$ to isolate $m$:

$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$

Final Formulas

Slope:

$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$

Intercept:

$$b = \frac{\sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i}{n}$$
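
Equivalently, $b = \bar{y} - m\bar{x}$, so the fitted line always passes through the mean point $(\bar{x}, \bar{y})$. The final formulas translate directly into NumPy; the sketch below uses a made-up dataset for illustration and cross-checks the closed forms against numpy.polyfit, which solves the same degree-1 least-squares problem:

```python
import numpy as np

# Made-up sample data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

n = len(x)

# Closed-form least-squares solution from the final formulas
m = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b = (np.sum(y) - m * np.sum(x)) / n
print(f"closed form: m = {m:.4f}, b = {b:.4f}")

# Cross-check: np.polyfit with degree 1 returns (slope, intercept)
m_np, b_np = np.polyfit(x, y, 1)
print(f"np.polyfit:  m = {m_np:.4f}, b = {b_np:.4f}")
```

Both prints should show (nearly) identical coefficients.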