Scikit-Learn Lasso Path

This tutorial is from the open-source community.

Introduction

This lab demonstrates how to compute the Lasso path on the diabetes dataset using the LARS algorithm. The Lasso path traces the coefficients of a linear model as the strength of the L1 penalty (the regularization parameter) varies. In the resulting plot, each colored line follows the coefficient of one feature along the path.
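For reference, the objective that scikit-learn documents for lars_path with method="lasso" can be written as below, where n is the number of samples, w the coefficient vector, and alpha the regularization parameter. The notation \hat{w}(\alpha) is introduced here only for illustration. The Lasso path is the family of solutions obtained as alpha moves from a large value (all coefficients zero) down to zero (the unpenalized least-squares fit).

```latex
\hat{w}(\alpha) = \arg\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
```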

Load Data

The first step is to load the diabetes dataset from Scikit-Learn.

from sklearn import datasets

X, y = datasets.load_diabetes(return_X_y=True)
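Before fitting anything, it can help to look at what was loaded. The sketch below is an optional check, not part of the original lab: it loads the same dataset as a Bunch object (the variable name diabetes is chosen here for illustration) so that the feature names are available next to the arrays.

```python
from sklearn import datasets

# Load as a Bunch object to get the feature names alongside the arrays.
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

print(X.shape, y.shape)        # (n_samples, n_features) and (n_samples,)
print(diabetes.feature_names)  # names of the columns of X

# The loader returns mean-centered columns, so these values are ~0.
print(X.mean(axis=0).round(6))
```

Because the columns are already centered and on a common scale, the coefficient magnitudes along the path are comparable across features.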

Compute Lasso Path

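Next, we compute the Lasso path with the LARS algorithm, using the lars_path function from Scikit-Learn's linear_model module. The function takes the input features, the target variable, and a method argument, and returns three values: the regularization values at each step of the path, the indices of the features that are active at the end of the path, and the coefficients at each step. The default method, "lar", runs plain least-angle regression; here we pass method="lasso" so that the path follows the L1-regularized (Lasso) solutions.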

from sklearn import linear_model

# lars_path returns (alphas, active, coefs); only the coefficient paths are needed here.
_, _, coefs = linear_model.lars_path(X, y, method="lasso", verbose=True)
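The call above discards the first two return values. As a rough sketch of what they contain, the snippet below keeps all three and prints their shapes. The names alphas and active are chosen here for illustration, and the exact number of steps depends on the data, so no specific output is claimed.

```python
import numpy as np
from sklearn import datasets, linear_model

X, y = datasets.load_diabetes(return_X_y=True)
alphas, active, coefs = linear_model.lars_path(X, y, method="lasso")

print(alphas.shape)  # one regularization value per step of the path
print(coefs.shape)   # (n_features, n_steps): one column of coefficients per step
print(active)        # indices of the features active at the end of the path

# Number of non-zero coefficients at each step; the path starts with none.
print(np.count_nonzero(coefs, axis=0))
```

The first column of coefs corresponds to the largest alpha, where every coefficient is zero, and the last column to the smallest alpha on the path.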

Plot Lasso Path

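After computing the Lasso path, we plot the results. The x-axis is the L1 norm of the coefficient vector at each step, divided by its value at the end of the path, so it runs from 0 (strongest regularization, all coefficients zero) to 1 (no regularization). The coefficient of each feature is plotted against this normalized norm, and dashed vertical lines mark the steps of the path.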

import numpy as np
import matplotlib.pyplot as plt

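# L1 norm of the coefficient vector at each step of the path,
# rescaled so that the x-axis runs from 0 to 1.
xx = np.sum(np.abs(coefs.T), axis=1)
xx /= xx[-1]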

plt.plot(xx, coefs.T)
ymin, ymax = plt.ylim()
plt.vlines(xx, ymin, ymax, linestyle="dashed")
plt.xlabel("|coef| / max|coef|")
plt.ylabel("Coefficients")
plt.title("LASSO Path")
plt.axis("tight")
plt.show()
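The plot above uses the normalized L1 norm on the x-axis rather than the regularization parameter itself. As an alternative view, the coefficients can be plotted directly against the alpha values returned by lars_path. This is a sketch rather than part of the original lab; the axis is inverted so that the strongest penalty sits on the left, matching the orientation of the plot above.

```python
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model

X, y = datasets.load_diabetes(return_X_y=True)
alphas, _, coefs = linear_model.lars_path(X, y, method="lasso")

fig, ax = plt.subplots()
ax.plot(alphas, coefs.T)  # one line per feature
ax.invert_xaxis()         # largest alpha (strongest penalty) on the left
ax.set_xlabel("alpha")
ax.set_ylabel("Coefficients")
ax.set_title("LASSO Path vs. alpha")
```

In a notebook the figure is displayed when the cell finishes; in a script, call plt.show() afterwards.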

Interpret Results

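The resulting plot shows the Lasso path for the diabetes dataset. Each colored line is the coefficient of one feature, and each dashed vertical line marks a step of the path. At the left edge the regularization is strongest and every coefficient is zero. Moving right, the regularization weakens and features enter the model one after another, until the right edge shows the unregularized solution. Read in the other direction, increasing the regularization parameter shrinks the coefficients towards zero, and some reach exactly zero before others. Features whose coefficients stay at zero over most of the path contribute less to the prediction under an L1 penalty, although with correlated features this ordering should be read with caution.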
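The order in which features enter the path can also be read from the coefficient array instead of the plot. The following sketch finds, for each feature, the first step at which its coefficient is non-zero. The variable name entry_step is chosen here for illustration, and the approach assumes every feature becomes non-zero somewhere along the full path.

```python
import numpy as np
from sklearn import datasets, linear_model

diabetes = datasets.load_diabetes()
_, _, coefs = linear_model.lars_path(diabetes.data, diabetes.target, method="lasso")

# Index of the first step at which each feature has a non-zero coefficient.
entry_step = np.argmax(coefs != 0, axis=1)

# List the features from earliest to latest entry.
for idx in np.argsort(entry_step, kind="stable"):
    print(f"{diabetes.feature_names[idx]:>4}  enters at step {entry_step[idx]}")
```

Features listed first are the ones that remain in the model under the strongest regularization.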

Summary

In this lab, we demonstrated how to compute and plot the Lasso Path using the LARS algorithm on the diabetes dataset. The Lasso Path is a useful visualization for understanding the effect of L1 regularization on the coefficients of a linear model.
