Plotting Decision Functions for Weighted Datasets with Scikit-Learn

Introduction

In this tutorial, we plot the decision function of a linear classifier trained on a weighted dataset using scikit-learn. We assign different weights to the samples and compare the resulting decision boundary with that of a model trained without weights, to show how the weights affect the decision function.

Import Required Libraries

We start by importing the necessary libraries for our project.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

Create a Weighted Dataset

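We create a weighted dataset using NumPy. We generate 20 random two-dimensional points: the first 10 are shifted by [1, 1] and labeled 1, and the remaining 10 are labeled -1. Each sample receives a random weight, and the weights of the first 10 samples are multiplied by 10.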

np.random.seed(0)
X = np.r_[np.random.randn(10, 2) + [1, 1], np.random.randn(10, 2)]
y = [1] * 10 + [-1] * 10
sample_weight = 100 * np.abs(np.random.randn(20))
sample_weight[:10] *= 10  # boost the first 10 samples (the points labeled 1)
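To see how unevenly the weights are distributed, it can help to total them per class. The snippet below is a minimal sketch that rebuilds the same arrays so it runs on its own; the loop and the printed format are illustrative additions, not part of the original lab.

```python
import numpy as np

# Same construction as in the step above, repeated so the snippet is self-contained.
np.random.seed(0)
X = np.r_[np.random.randn(10, 2) + [1, 1], np.random.randn(10, 2)]
y = np.array([1] * 10 + [-1] * 10)
sample_weight = 100 * np.abs(np.random.randn(20))
sample_weight[:10] *= 10  # boosts rows 0-9, i.e. the samples labeled 1

# Total weight carried by each class (illustrative check, not in the original lab).
for label in (1, -1):
    total = sample_weight[y == label].sum()
    print(f"class {label:+d}: total weight = {total:.1f}")
```

The class labeled 1 should carry most of the total weight, which is why the weighted model is expected to pay more attention to those points.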

Plot the Weighted Dataset

We plot the weighted dataset using matplotlib. The size of each point is proportional to its weight. We also create the grid on which the decision functions are evaluated in the following steps.

xx, yy = np.meshgrid(np.linspace(-4, 5, 500), np.linspace(-4, 5, 500))
fig, ax = plt.subplots()
ax.scatter(
    X[:, 0],
    X[:, 1],
    c=y,
    s=sample_weight,
    alpha=0.9,
    cmap=plt.cm.bone,
    edgecolor="black",
)
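The grid returned by np.meshgrid is easier to follow on a small example. The sketch below uses a deliberately tiny grid (4 points along x, 3 along y) instead of the 500 x 500 grid from this step; the names xs, ys, xx_small, yy_small and grid_points are chosen here for illustration only.

```python
import numpy as np

# A deliberately tiny grid: 4 points along x, 3 points along y.
xs = np.linspace(-4, 5, 4)
ys = np.linspace(-4, 5, 3)
xx_small, yy_small = np.meshgrid(xs, ys)

# Flatten both coordinate arrays and stack them column-wise:
# one row per grid point, one column per feature.
grid_points = np.c_[xx_small.ravel(), yy_small.ravel()]

print(xx_small.shape)     # (rows follow ys, columns follow xs)
print(grid_points.shape)  # (number of grid points, 2)
```

This (n_points, 2) layout is the form a fitted classifier expects, and reshaping its output back to xx.shape recovers the two-dimensional array that contour needs.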

Fit the Unweighted Model

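We fit an unweighted model using scikit-learn's SGDClassifier, a linear classifier trained with stochastic gradient descent. We then evaluate its decision function on the grid and draw the zero level as a solid line, which is the decision boundary of the unweighted model.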

clf = linear_model.SGDClassifier(alpha=0.01, max_iter=100)
clf.fit(X, y)
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
no_weights = ax.contour(xx, yy, Z, levels=[0], linestyles=["solid"])
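For a linear model such as this one, the decision function is an affine function of the inputs, and the predicted class depends on its sign. The following sketch checks both relationships on the training points; the variable names manual and predicted are illustrative, and the data is rebuilt so the snippet is self-contained.

```python
import numpy as np
from sklearn import linear_model

# Rebuild the dataset from the earlier step.
np.random.seed(0)
X = np.r_[np.random.randn(10, 2) + [1, 1], np.random.randn(10, 2)]
y = [1] * 10 + [-1] * 10

clf = linear_model.SGDClassifier(alpha=0.01, max_iter=100)
clf.fit(X, y)

# Signed score for each sample, as returned by the classifier.
scores = clf.decision_function(X)

# The same score computed by hand from the fitted coefficients and intercept.
manual = X @ clf.coef_.ravel() + clf.intercept_[0]

# Positive scores map to classes_[1], all others to classes_[0].
predicted = np.where(scores > 0, clf.classes_[1], clf.classes_[0])

print(np.allclose(scores, manual))
print(np.array_equal(predicted, clf.predict(X)))
```

The contour drawn at levels=[0] is therefore the set of grid points where the score changes sign, that is, where the predicted class switches.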

Fit the Weighted Model

We fit a weighted model using the same classifier as in the previous step, but this time we pass the sample_weight argument to the fit method. We then draw the decision boundary of the weighted model as a dashed line on the same axes.

clf = linear_model.SGDClassifier(alpha=0.01, max_iter=100)
clf.fit(X, y, sample_weight=sample_weight)
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
samples_weights = ax.contour(xx, yy, Z, levels=[0], linestyles=["dashed"])
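Besides comparing the two boundaries visually, one possible numeric check is to score both models with and without the sample weights. This is a sketch rather than part of the original lab: the names plain, weighted and results are assumptions made for the example, and the exact numbers depend on the scikit-learn version.

```python
import numpy as np
from sklearn import linear_model

# Rebuild the dataset and weights from the earlier steps.
np.random.seed(0)
X = np.r_[np.random.randn(10, 2) + [1, 1], np.random.randn(10, 2)]
y = [1] * 10 + [-1] * 10
sample_weight = 100 * np.abs(np.random.randn(20))
sample_weight[:10] *= 10

plain = linear_model.SGDClassifier(alpha=0.01, max_iter=100).fit(X, y)
weighted = linear_model.SGDClassifier(alpha=0.01, max_iter=100).fit(
    X, y, sample_weight=sample_weight
)

results = {}
for name, model in (("no weights", plain), ("with weights", weighted)):
    # Plain accuracy counts every sample equally;
    # weighted accuracy counts each sample in proportion to its weight.
    acc = model.score(X, y)
    weighted_acc = model.score(X, y, sample_weight=sample_weight)
    results[name] = (acc, weighted_acc)
    print(f"{name}: accuracy={acc:.2f}, weighted accuracy={weighted_acc:.2f}")
```

Weighted accuracy is only one way to summarize the effect; the plot remains the primary comparison in this tutorial.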

Add Legend and Display the Plot

We add a legend to the plot to differentiate between the unweighted and weighted models. We then display the plot.

no_weights_handles, _ = no_weights.legend_elements()
weights_handles, _ = samples_weights.legend_elements()
ax.legend(
    [no_weights_handles[0], weights_handles[0]],
    ["no weights", "with weights"],
    loc="lower left",
)

ax.set(xticks=(), yticks=())
plt.show()
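The two fitting steps repeat the same evaluate, reshape and contour sequence. One way to avoid the repetition is a small helper; the function name draw_boundary below is hypothetical and not part of scikit-learn or matplotlib. The Agg backend is selected only so the sketch runs without a display; in a notebook, that line can be dropped and plt.show() called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model


def draw_boundary(ax, clf, xx, yy, linestyle):
    """Hypothetical helper: draw the zero level of clf's decision function."""
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    return ax.contour(xx, yy, Z, levels=[0], linestyles=[linestyle])


np.random.seed(0)
X = np.r_[np.random.randn(10, 2) + [1, 1], np.random.randn(10, 2)]
y = [1] * 10 + [-1] * 10
sample_weight = 100 * np.abs(np.random.randn(20))
sample_weight[:10] *= 10

# A coarser grid than in the tutorial keeps the sketch fast.
xx, yy = np.meshgrid(np.linspace(-4, 5, 100), np.linspace(-4, 5, 100))
fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=y, s=sample_weight, alpha=0.9,
           cmap=plt.cm.bone, edgecolor="black")

plain = linear_model.SGDClassifier(alpha=0.01, max_iter=100).fit(X, y)
weighted = linear_model.SGDClassifier(alpha=0.01, max_iter=100).fit(
    X, y, sample_weight=sample_weight
)

no_weights = draw_boundary(ax, plain, xx, yy, "solid")
samples_weights = draw_boundary(ax, weighted, xx, yy, "dashed")

# Build the legend from one handle per contour set, as in the step above.
handles = [no_weights.legend_elements()[0][0],
           samples_weights.legend_elements()[0][0]]
ax.legend(handles, ["no weights", "with weights"], loc="lower left")
ax.set(xticks=(), yticks=())
```

Keeping the contour logic in one place means any change to the grid or the plotted level applies to both models at once.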

Summary

In this tutorial, we learned how to plot a decision function of a weighted dataset using scikit-learn. We also learned how to assign different weights to the samples in the dataset to show how the weights affect the decision function.
