Recursive Feature Elimination

Introduction

In this lab, we will learn how to use Recursive Feature Elimination (RFE) for feature selection. We will be using the Scikit-Learn library in Python to perform this task. Feature selection is an important step in machine learning to improve model performance by removing irrelevant or redundant features.

VM Tips

After the VM starts up, click the top left corner to switch to the Notebook tab and open Jupyter Notebook for practice.

Sometimes, you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook.

If you face issues during learning, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.


Load the Dataset and Split the Data

First, we will load the digits dataset using the Scikit-Learn library. This dataset consists of 8x8 images of digits from 0 to 9. Each image is represented as an array of 64 features. We will split the data into features and target variables.

from sklearn.datasets import load_digits

# Load the digits dataset and flatten each 8x8 image into a 64-feature vector
digits = load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target
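
To confirm the reshape worked, you can check the array shapes. Assuming the standard digits dataset of 1,797 samples, you should see the following:

print(X.shape)  # (1797, 64): one flattened 8x8 image per row
print(y.shape)  # (1797,): the digit label (0-9) for each image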

Create the RFE Object and Fit the Data

Next, we will create an RFE object and fit it to the data. We will use a Support Vector Classifier (SVC) with a linear kernel as the estimator. Setting n_features_to_select=1 makes RFE keep eliminating features until only one remains, which assigns a rank to every feature, and step=1 removes one feature per iteration.

from sklearn.svm import SVC
from sklearn.feature_selection import RFE

# A linear SVC provides the coefficients RFE uses to score features
svc = SVC(kernel="linear", C=1)
# Remove one feature per iteration until a single feature remains
rfe = RFE(estimator=svc, n_features_to_select=1, step=1)
rfe.fit(X, y)
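
The fitted RFE object exposes a ranking_ array with one rank per feature and a support_ mask marking the features kept at the end. The snippet below is a minimal check of both attributes:

import numpy as np

# Rank 1 is the most important feature; higher ranks were eliminated earlier
print(rfe.ranking_[:10])
# support_ is True only for the n_features_to_select features kept at the end (here, just one)
print(np.sum(rfe.support_))  # 1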

Rank the Features

After fitting the data to the RFE object, we can rank the features based on their importance. The ranking_ attribute of the RFE object gives the feature rankings, where rank 1 marks the most important feature. We will also reshape the rankings to match the 8x8 shape of the original images.

# Reshape the per-pixel ranks back to the 8x8 image layout
ranking = rfe.ranking_.reshape(digits.images[0].shape)

Visualize the Feature Rankings

Finally, we will plot the feature rankings using the Matplotlib library. We will use the matshow() function to display the rankings as an image. We will also add a color bar and a title to the plot.

import matplotlib.pyplot as plt

# Lighter pixels have lower ranks (kept longer by RFE); darker pixels were eliminated earlier
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
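
In practice, you usually keep more than one feature. The sketch below is a minimal example of using RFE inside a Pipeline so that only the selected pixels reach the classifier; the number of selected pixels (16) and the 5-fold cross-validation are arbitrary choices for illustration:

from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

# Keep the 16 highest-ranked pixels, then classify with the same linear SVC
pipe = Pipeline([
    ("rfe", RFE(estimator=SVC(kernel="linear", C=1), n_features_to_select=16, step=1)),
    ("clf", SVC(kernel="linear", C=1)),
])

# Cross-validated accuracy using only the selected pixels
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())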

Summary

In this lab, we learned how to use Recursive Feature Elimination (RFE) for feature selection. We used the Scikit-Learn library in Python to load the digits dataset, create an RFE object, fit the data, rank the features, and visualize the feature rankings.
