Simplified Mathematical Proof of SVM

Nerd Cafe

Part 1: Simplified Mathematical Proof of SVM

Goal of SVM

Imagine you have two kinds of points on a 2D graph:

  • 🟢 Class +1

  • 🔴 Class -1

The goal of SVM is to draw the best straight line that:

  1. Separates the two classes

  2. Is as far away as possible from the closest points

This line is called the decision boundary, and the closest points are called support vectors.

Step 1: Define the Line

We define the separating line (or hyperplane) as:

W^{T}X + b = 0

Where:

  • W is the weight vector (it sets the orientation of the line)

  • b is the bias (how far the line is from the origin)

  • X is the input point (like (2, 3), etc.)

Step 2: Set the Condition for Correct Classification

We want:

  • If y = +1, the point is above the line:

W^{T}X + b \ge 1

  • If y = -1, the point is below the line:

W^{T}X + b \le -1

Combine both (multiplying by the label y_i folds the two cases into one inequality):

y_{i}\left( W^{T}x_{i} + b \right) \ge 1

Step 3: Maximize the Margin

The margin is the distance from the line to the closest point. SVM wants to maximize this margin. Since the closest points on the two sides sit on W^{T}x + b = +1 and W^{T}x + b = -1, the distance between these two margin lines is:

Margin = \frac{2}{\left\| W \right\|}

To maximize this, we minimize:

\frac{1}{2}\left\| W \right\|^{2}

Subject to:

y_{i}\left( W^{T}x_{i} + b \right) \ge 1

That’s the core idea of SVM!
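
To make the margin formula concrete, here is a tiny sketch (the weight vectors are chosen by us purely for illustration) that computes 2/||W|| for two candidates; the one with the smaller norm gives the wider margin:

import numpy as np

# Margin width 2 / ||W|| for two example weight vectors
for W in ([1.0, -1.0], [0.5, -0.5]):
    print("W =", W, " margin =", 2 / np.linalg.norm(W))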

Part 2: Simple Numerical Example (Step by Step)

Data

| Point | x_i = (x_1, x_2) | y_i |
| ----- | ---------------- | --- |
| A     | (1, 1)           | +1  |
| B     | (2, 2)           | +1  |
| C     | (2, 0)           | -1  |
| D     | (0, 0)           | -1  |

We want to draw a line that separates the +1 and -1 classes.

Step 1: Assume a Solution (try values for W and b)

Let’s guess a line:

W = (1, -1), \quad b = 0

So the equation of the line is:

x_{1} - x_{2} = 0 \quad \text{(or)} \quad x_{2} = x_{1}

Step 2: Plug into the condition y_{i}(W^{T}x_{i} + b) \ge 1

Check all points:

| Point        | W^{T}x_i + b  | Result y_i × (·)      |
| ------------ | ------------- | --------------------- |
| A (1, 1), +1 | 1×1 − 1×1 = 0 | (+1)×0 = 0 ❌ Not ≥ 1 |
| B (2, 2), +1 | 2 − 2 = 0     | (+1)×0 = 0 ❌         |
| C (2, 0), −1 | 2 − 0 = 2     | (−1)×2 = −2 ❌        |
| D (0, 0), −1 | 0 − 0 = 0     | (−1)×0 = 0 ❌         |

So our guess was wrong.
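
We can automate this check with a few lines of plain Python. This is a small illustrative sketch (the helper name check_line is ours, not part of the lesson); it evaluates y_i (W^T x_i + b) for every point and reports whether the constraint holds:

points = [(1, 1), (2, 2), (2, 0), (0, 0)]   # A, B, C, D
labels = [+1, +1, -1, -1]

def check_line(W, b):
    # Evaluate y_i * (W^T x_i + b) for every point and report the constraint
    for (x1, x2), yi in zip(points, labels):
        score = W[0] * x1 + W[1] * x2 + b      # W^T x_i + b
        value = yi * score
        status = "OK (>= 1)" if value >= 1 else "fails"
        print(f"({x1}, {x2})  y = {yi:+d}  y*(W.x + b) = {value:+d}  {status}")

check_line(W=(1, -1), b=0)   # the first guess: every point fails the condition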

Step 3: Try a Better Line

Try:

W = (1, 1), \quad b = -3

So the line is:

x_{1} + x_{2} = 3

Check:

| Point        | x_1 + x_2 | y_i (W^{T}x_i + b) |
| ------------ | --------- | ------------------ |
| A (1, 1), +1 | 1 + 1 = 2 | 1×(2−3) = −1 ❌    |
| B (2, 2), +1 | 2 + 2 = 4 | 1×(4−3) = +1 ✅    |
| C (2, 0), −1 | 2 + 0 = 2 | −1×(2−3) = +1 ✅   |
| D (0, 0), −1 | 0 + 0 = 0 | −1×(0−3) = +3 ✅   |

So only point A fails the condition; this guess is almost correct. Instead of guessing further, we can hand the optimization problem from Part 1 to a numerical solver, as sketched below.
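
Here is a minimal sketch of that idea (it uses scipy.optimize, which is not part of this lesson's code, and the variable names are ours): it asks a general-purpose solver to minimize (1/2)||W||^2 subject to y_i (W^T x_i + b) >= 1 for the four points above.

import numpy as np
from scipy.optimize import minimize

# Same four training points and labels as in the tables above
X = np.array([[1, 1], [2, 2], [2, 0], [0, 0]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)

# Pack the unknowns as theta = [w1, w2, b]
def objective(theta):
    w = theta[:2]
    return 0.5 * np.dot(w, w)                  # (1/2) ||W||^2

# One inequality constraint per point: y_i (W^T x_i + b) - 1 >= 0
constraints = [
    {"type": "ineq",
     "fun": lambda theta, xi=xi, yi=yi: yi * (xi @ theta[:2] + theta[2]) - 1}
    for xi, yi in zip(X, y)
]

result = minimize(objective, x0=np.zeros(3), constraints=constraints, method="SLSQP")
w_opt, b_opt = result.x[:2], result.x[2]
print("W:", w_opt, "b:", b_opt, "margin:", 2 / np.linalg.norm(w_opt))

For this toy data the solver should land near W = (0, 2), b = -1, i.e. a margin of 2/||W|| = 1. Note that scikit-learn's SVC used below applies a soft margin (C = 1.0), so its fitted coefficients can come out different from this hard-margin solution.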

Step-by-Step Python Code: SVM Example

Let’s build a simple linear Support Vector Machine (SVM) example in Python using NumPy, Matplotlib, and scikit-learn, step by step, based on the math we discussed.

We’ll:

  1. Create a small dataset.

  2. Visualize it.

  3. Train a simple linear SVM using sklearn.

  4. Show the decision boundary and support vectors.

Step 1: Import Libraries

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

Step 2: Define a Small Dataset

# Data points (features)
X = np.array([
    [1, 1],   # class +1
    [2, 2],   # class +1
    [2, 0],   # class -1
    [0, 0],   # class -1
])

# Labels
y = np.array([1, 1, -1, -1])

Step 3: Train the SVM Model

# Create a linear SVM classifier
svm = SVC(kernel='linear', C=1.0)

# Fit the model
svm.fit(X, y)
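
Once fitted, the model can classify new points right away. A quick, illustrative check (the two points below are arbitrary):

# Predict the class of two new (arbitrary) points with the trained model
new_points = np.array([[1.5, 1.8], [2.0, 0.2]])
print(svm.predict(new_points))   # prints a label (+1 or -1) for each point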

Step 4: Plot the Data, Decision Boundary, and Support Vectors

# Plotting
plt.figure(figsize=(8, 6))

# Plot data points
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='bwr', s=100, edgecolors='k')

# Plot support vectors
plt.scatter(
    svm.support_vectors_[:, 0],
    svm.support_vectors_[:, 1],
    s=200, facecolors='none', edgecolors='k', linewidths=2, label='Support Vectors'
)

# Extract the separating hyperplane
w = svm.coef_[0]
b = svm.intercept_[0]

# Plot the decision boundary
x_plot = np.linspace(-1, 3, 100)
y_plot = -(w[0] * x_plot + b) / w[1]
plt.plot(x_plot, y_plot, 'k-', label='Decision Boundary')

# Margins: the lines where the decision function equals +1 and -1
y_margin_up = -(w[0] * x_plot + b - 1) / w[1]
y_margin_down = -(w[0] * x_plot + b + 1) / w[1]
plt.plot(x_plot, y_margin_up, 'k--', linewidth=1)
plt.plot(x_plot, y_margin_down, 'k--', linewidth=1)

# Labels and legend
plt.title("SVM with Linear Kernel")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.grid(True)
plt.show()

Output: a scatter plot of the four points, with the solid decision boundary, the dashed margin lines, and the support vectors circled.

Explanation

  • svm.support_vectors_: The actual support vectors found by the algorithm.

  • svm.coef_: The learned weight vector W.

  • svm.intercept_: The learned bias b.

  • The dashed lines are the margins (distance from the decision boundary).

  • The solid black line is the separating hyperplane.
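
To tie this back to Part 1, the margin width 2/||W|| can be read off the fitted coefficients. A one-line check (for this data the learned W turns out to be [0, 1], as shown in Step 5, so this prints 2.0):

# Margin width 2 / ||W||, using the weight vector learned by the model
print("Margin width:", 2 / np.linalg.norm(svm.coef_[0]))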

Step 5: Print the Model Parameters

Add this code after training the SVM:

# Print weight vector w and bias b
print("Weight vector w:", svm.coef_)
print("Bias term b:", svm.intercept_)

# Print the support vectors
print("Support Vectors:\n", svm.support_vectors_)

# Print the indices of support vectors
print("Indices of Support Vectors:", svm.support_)

# Print the dual coefficients
print("Dual Coefficients (α_i * y_i):", svm.dual_coef_)

Output

Weight vector w: [[0. 1.]]
Bias term b: [-1.]
Support Vectors:
 [[2. 0.]
 [0. 0.]
 [1. 1.]]
Indices of Support Vectors: [2 3 0]
Dual Coefficients (α_i * y_i): [[-0.5 -0.5  1. ]]
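
As a quick consistency check (a small sketch, not part of the original lesson), the decision values W^T x_i + b can be recomputed by hand from these parameters and compared with what the model reports:

# Recompute f(x_i) = W^T x_i + b from the learned parameters
f_manual = X @ svm.coef_[0] + svm.intercept_[0]
print("Manual decision values:  ", f_manual)
print("svm.decision_function(X):", svm.decision_function(X))   # should match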
