Mastering SVM Classifier on Iris Dataset

Introduction

The Iris dataset is a classic benchmark for classification problems. In this lab, we will plot the decision surfaces of several SVM classifiers on the Iris dataset using scikit-learn in Python. To keep the problem two-dimensional, we use only the first two features (sepal length and sepal width) and compare two linear SVMs with SVMs that use RBF and polynomial kernels.

Import necessary libraries and load the dataset

import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.inspection import DecisionBoundaryDisplay

## import some data to play with
iris = datasets.load_iris()
## Take the first two features. We could avoid this by using a two-dim dataset
X = iris.data[:, :2]
y = iris.target
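
Before fitting anything, it can help to confirm what the slicing above actually keeps. The short check below is only an illustrative sketch: it reloads the data and prints the names and shapes of the two retained features.

```python
from sklearn import datasets

# Reload the data so this check runs on its own
iris = datasets.load_iris()
X = iris.data[:, :2]  # keep only the first two of the four features
y = iris.target

# Names of the two features that remain after slicing
print(iris.feature_names[:2])  # ['sepal length (cm)', 'sepal width (cm)']

# 150 samples, 2 features each, and one class label per sample
print(X.shape, y.shape)  # (150, 2) (150,)

# The three species encoded as 0, 1 and 2 in y
print(iris.target_names)
```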

Create SVM classifiers and fit the data

C = 1.0  ## SVM regularization parameter
models = (
    svm.SVC(kernel="linear", C=C),
    svm.LinearSVC(C=C, max_iter=10000, dual="auto"),
    svm.SVC(kernel="rbf", gamma=0.7, C=C),
    svm.SVC(kernel="poly", degree=3, gamma="auto", C=C),
)
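## Fit every classifier. A list (rather than a generator) can be iterated
## more than once, so the plotting cell can be re-run in the notebook.
models = [clf.fit(X, y) for clf in models]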

Plot the decision surface for the classifiers

## Set-up 2x2 grid for plotting.
fig, sub = plt.subplots(2, 2)
plt.subplots_adjust(wspace=0.4, hspace=0.4)

X0, X1 = X[:, 0], X[:, 1]

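## title for each subplot, in the same order as `models`
titles = (
    "SVC with linear kernel",
    "LinearSVC (linear kernel)",
    "SVC with RBF kernel",
    "SVC with polynomial (degree 3) kernel",
)

## create a DecisionBoundaryDisplay for each classifier
for clf, title, ax in zip(models, titles, sub.flatten()):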
    disp = DecisionBoundaryDisplay.from_estimator(
        clf,
        X,
        response_method="predict",
        cmap=plt.cm.coolwarm,
        alpha=0.8,
        ax=ax,
        xlabel=iris.feature_names[0],
        ylabel=iris.feature_names[1],
    )
    ## plot the data points
    ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors="k")
    ax.set_xticks(())
    ax.set_yticks(())
    ax.set_title(title)

plt.show()
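
Conceptually, a decision surface is the classifier's prediction evaluated at every point of a dense grid over the feature plane. The sketch below builds such a grid by hand for the RBF model. It is a simplified illustration of the idea, not scikit-learn's actual implementation, and the margin of 1.0 and the 100 x 100 resolution are arbitrary choices made here.

```python
import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target
clf = svm.SVC(kernel="rbf", gamma=0.7, C=1.0).fit(X, y)

# Grid that covers the data with a small margin on every side
x_min, x_max = X[:, 0].min() - 1.0, X[:, 0].max() + 1.0
y_min, y_max = X[:, 1].min() - 1.0, X[:, 1].max() + 1.0
xx, yy = np.meshgrid(
    np.linspace(x_min, x_max, 100),
    np.linspace(y_min, y_max, 100),
)

# Predict a class for every grid point, then restore the grid shape
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = clf.predict(grid_points).reshape(xx.shape)

print(Z.shape)  # (100, 100)
print(np.unique(Z))  # class labels that appear somewhere on the grid
```

Passing `xx`, `yy` and `Z` to `ax.contourf` would draw filled regions comparable to the ones in the plots above.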

Interpret the results

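The code above produces a 2x2 grid of subplots, one per classifier. Each subplot shows the classifier's decision surface as colored regions, with the training points overlaid and colored by target class. The title of each subplot names the model and the kernel it uses. The two linear models separate the classes with straight-line boundaries, while the RBF and polynomial kernels produce curved boundaries whose shape depends on the kernel and its parameters.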

Summary

In this lab, we learned how to plot the decision surfaces of different SVM classifiers on the Iris dataset using scikit-learn in Python. We compared linear, RBF, and polynomial SVM classifiers on a 2D projection of the dataset and interpreted the resulting decision boundaries. SVM classifiers are powerful tools for classification and can be applied to a wide range of datasets.
