Simplified Mathematical Proof of SVM

Nerd Cafe

Part 1: Simplified Mathematical Proof of SVM

Goal of SVM

Imagine you have two kinds of points on a 2D graph:

  • 🟢 Class +1

  • 🔴 Class -1

The goal of SVM is to draw the best straight line that:

  1. Separates the two classes

  2. Is as far away as possible from the closest points

This line is called the decision boundary, and the closest points are called support vectors.

Step 1: Define the Line

We define the separating line (or hyperplane) as:

W^{T}X + b = 0

Where:

  • W is the weight vector (it sets the orientation of the line)

  • b is the bias (how far the line is from the origin)

  • X is the input point (like (2, 3), etc.)

Step 2: Set the Condition for Correct Classification

We want:

  • If y = +1, the point is above the line:

W^{T}X + b \ge 1

  • If y = -1, the point is below the line:

W^{T}X + b \le -1

Combine both (multiplying by the label y_i folds the two cases into one inequality):

y_{i}\left( W^{T}x_{i} + b \right) \ge 1

Step 3: Maximize the Margin

The margin is the distance from the line to the closest point. SVM wants to maximize this margin. Since the closest points on the two sides sit on W^{T}x + b = +1 and W^{T}x + b = -1, the distance between these two margin lines is:

Margin = \frac{2}{\left\| W \right\|}

To maximize this, we minimize:

\frac{1}{2}\left\| W \right\|^{2}

Subject to:

y_{i}\left( W^{T}x_{i} + b \right) \ge 1

That’s the core idea of SVM!
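
To make the margin formula concrete, here is a tiny sketch (the weight vectors are chosen by us purely for illustration) that computes 2/||W|| for two candidates; the one with the smaller norm gives the wider margin:

import numpy as np

# Margin width 2 / ||W|| for two example weight vectors
for W in ([1.0, -1.0], [0.5, -0.5]):
    print("W =", W, " margin =", 2 / np.linalg.norm(W))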

Part 2: Simple Numerical Example (Step by Step)

Data

| Point | x_i = (x_1, x_2) | y_i |
| ----- | ---------------- | --- |
| A     | (1, 1)           | +1  |
| B     | (2, 2)           | +1  |
| C     | (2, 0)           | -1  |
| D     | (0, 0)           | -1  |

We want to draw a line that separates the +1 and -1 classes.

Step 1: Assume a Solution (try values for W and b)

Let’s guess a line:

W = (1, -1), \quad b = 0

So the equation of the line is:

x_{1} - x_{2} = 0 \quad \text{(or)} \quad x_{2} = x_{1}

Step 2: Plug into the condition y_{i}(W^{T}x_{i} + b) \ge 1

Check all points:

| Point        | W^{T}x_i + b  | Result y_i × (·)      |
| ------------ | ------------- | --------------------- |
| A (1, 1), +1 | 1×1 − 1×1 = 0 | (+1)×0 = 0 ❌ Not ≥ 1 |
| B (2, 2), +1 | 2 − 2 = 0     | (+1)×0 = 0 ❌         |
| C (2, 0), −1 | 2 − 0 = 2     | (−1)×2 = −2 ❌        |
| D (0, 0), −1 | 0 − 0 = 0     | (−1)×0 = 0 ❌         |

So our guess was wrong.
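
We can automate this check with a few lines of plain Python. This is a small illustrative sketch (the helper name check_line is ours, not part of the lesson); it evaluates y_i (W^T x_i + b) for every point and reports whether the constraint holds:

points = [(1, 1), (2, 2), (2, 0), (0, 0)]   # A, B, C, D
labels = [+1, +1, -1, -1]

def check_line(W, b):
    # Evaluate y_i * (W^T x_i + b) for every point and report the constraint
    for (x1, x2), yi in zip(points, labels):
        score = W[0] * x1 + W[1] * x2 + b      # W^T x_i + b
        value = yi * score
        status = "OK (>= 1)" if value >= 1 else "fails"
        print(f"({x1}, {x2})  y = {yi:+d}  y*(W.x + b) = {value:+d}  {status}")

check_line(W=(1, -1), b=0)   # the first guess: every point fails the condition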

Step 3: Try a Better Line

Try:

W = (1, 1), \quad b = -3

So the line is:

x_{1} + x_{2} = 3

Check:

| Point        | x_1 + x_2 | y_i (W^{T}x_i + b) |
| ------------ | --------- | ------------------ |
| A (1, 1), +1 | 1 + 1 = 2 | 1×(2−3) = −1 ❌    |
| B (2, 2), +1 | 2 + 2 = 4 | 1×(4−3) = +1 ✅    |
| C (2, 0), −1 | 2 + 0 = 2 | −1×(2−3) = +1 ✅   |
| D (0, 0), −1 | 0 + 0 = 0 | −1×(0−3) = +3 ✅   |

So only point A fails the condition; this guess is almost correct. Instead of guessing further, we can hand the optimization problem from Part 1 to a numerical solver, as sketched below.
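
Here is a minimal sketch of that idea (it uses scipy.optimize, which is not part of this lesson's code, and the variable names are ours): it asks a general-purpose solver to minimize (1/2)||W||^2 subject to y_i (W^T x_i + b) >= 1 for the four points above.

import numpy as np
from scipy.optimize import minimize

# Same four training points and labels as in the tables above
X = np.array([[1, 1], [2, 2], [2, 0], [0, 0]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)

# Pack the unknowns as theta = [w1, w2, b]
def objective(theta):
    w = theta[:2]
    return 0.5 * np.dot(w, w)                  # (1/2) ||W||^2

# One inequality constraint per point: y_i (W^T x_i + b) - 1 >= 0
constraints = [
    {"type": "ineq",
     "fun": lambda theta, xi=xi, yi=yi: yi * (xi @ theta[:2] + theta[2]) - 1}
    for xi, yi in zip(X, y)
]

result = minimize(objective, x0=np.zeros(3), constraints=constraints, method="SLSQP")
w_opt, b_opt = result.x[:2], result.x[2]
print("W:", w_opt, "b:", b_opt, "margin:", 2 / np.linalg.norm(w_opt))

For this toy data the solver should land near W = (0, 2), b = -1, i.e. a margin of 2/||W|| = 1. Note that scikit-learn's SVC used below applies a soft margin (C = 1.0), so its fitted coefficients can come out different from this hard-margin solution.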

Step-by-Step Python Code: SVM Example

Let’s build a simple linear Support Vector Machine (SVM) example in Python using NumPy, Matplotlib, and scikit-learn, step by step, based on the math we discussed.

We’ll:

  1. Create a small dataset.

  2. Visualize it.

  3. Train a simple linear SVM using sklearn.

  4. Show the decision boundary and support vectors.

Step 1: Import Libraries

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

Step 2: Define a Small Dataset

# Data points (features)
X = np.array([
    [1, 1],   # class +1
    [2, 2],   # class +1
    [2, 0],   # class -1
    [0, 0],   # class -1
])

# Labels
y = np.array([1, 1, -1, -1])

Step 3: Train the SVM Model

# Create a linear SVM classifier
svm = SVC(kernel='linear', C=1.0)

# Fit the model
svm.fit(X, y)
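
Once fitted, the model can classify new points right away. A quick, illustrative check (the two points below are arbitrary):

# Predict the class of two new (arbitrary) points with the trained model
new_points = np.array([[1.5, 1.8], [2.0, 0.2]])
print(svm.predict(new_points))   # prints a label (+1 or -1) for each point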

Step 4: Plot the Data, Decision Boundary, and Support Vectors

# Plotting
plt.figure(figsize=(8, 6))

# Plot data points
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='bwr', s=100, edgecolors='k')

# Plot support vectors
plt.scatter(
    svm.support_vectors_[:, 0],
    svm.support_vectors_[:, 1],
    s=200, facecolors='none', edgecolors='k', linewidths=2, label='Support Vectors'
)

# Extract the separating hyperplane
w = svm.coef_[0]
b = svm.intercept_[0]

# Plot the decision boundary
x_plot = np.linspace(-1, 3, 100)
y_plot = -(w[0] * x_plot + b) / w[1]
plt.plot(x_plot, y_plot, 'k-', label='Decision Boundary')

# Margins: the lines where the decision function equals +1 and -1
y_margin_up = -(w[0] * x_plot + b - 1) / w[1]
y_margin_down = -(w[0] * x_plot + b + 1) / w[1]
plt.plot(x_plot, y_margin_up, 'k--', linewidth=1)
plt.plot(x_plot, y_margin_down, 'k--', linewidth=1)

# Labels and legend
plt.title("SVM with Linear Kernel")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.grid(True)
plt.show()

Output: a scatter plot of the four points, with the solid decision boundary, the dashed margin lines, and the support vectors circled.

Explanation

  • svm.support_vectors_: The actual support vectors found by the algorithm.

  • svm.coef_: The learned weight vector W.

  • svm.intercept_: The learned bias b.

  • The dashed lines are the margins (distance from the decision boundary).

  • The solid black line is the separating hyperplane.
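
To tie this back to Part 1, the margin width 2/||W|| can be read off the fitted coefficients. A one-line check (for this data the learned W turns out to be [0, 1], as shown in Step 5, so this prints 2.0):

# Margin width 2 / ||W||, using the weight vector learned by the model
print("Margin width:", 2 / np.linalg.norm(svm.coef_[0]))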

Step 5: Print the Model Parameters

Add this code after training the SVM:

# Print weight vector w and bias b
print("Weight vector w:", svm.coef_)
print("Bias term b:", svm.intercept_)

# Print the support vectors
print("Support Vectors:\n", svm.support_vectors_)

# Print the indices of support vectors
print("Indices of Support Vectors:", svm.support_)

# Print the dual coefficients
print("Dual Coefficients (α_i * y_i):", svm.dual_coef_)

Output

Weight vector w: [[0. 1.]]
Bias term b: [-1.]
Support Vectors:
 [[2. 0.]
 [0. 0.]
 [1. 1.]]
Indices of Support Vectors: [2 3 0]
Dual Coefficients (α_i * y_i): [[-0.5 -0.5  1. ]]
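
As a quick consistency check (a small sketch, not part of the original lesson), the decision values W^T x_i + b can be recomputed by hand from these parameters and compared with what the model reports:

# Recompute f(x_i) = W^T x_i + b from the learned parameters
f_manual = X @ svm.coef_[0] + svm.intercept_[0]
print("Manual decision values:  ", f_manual)
print("svm.decision_function(X):", svm.decision_function(X))   # should match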
