Tuning Hyperparameters of an Estimator


Introduction

Hyperparameters are parameters that are not directly learned by an estimator. They are passed as arguments to the constructor of the estimator class. Tuning the hyperparameters of an estimator is an important step in building effective machine learning models: it involves finding the combination of hyperparameters that results in the best model performance.

Scikit-learn provides several tools to search for the best hyperparameters: GridSearchCV and RandomizedSearchCV. In this lab, we will walk through the process of tuning hyperparameters using these tools.

VM Tips

After the VM starts up, click the top-left corner to switch to the Notebook tab and open Jupyter Notebook for practice.

Sometimes you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook.

If you face issues during the lab, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.



Import the necessary libraries

First, we need to import the necessary libraries for our analysis. We will be using sklearn.model_selection to perform the hyperparameter tuning.

import numpy as np
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

Load the dataset

Next, let's load the dataset that we will be working with. For this exercise we use the classic iris dataset, which ships with scikit-learn.

from sklearn.datasets import load_iris

# Load the iris dataset
iris = load_iris()

# Split the data into features and target
X = iris.data
y = iris.target
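As a quick sanity check before tuning, we can inspect the shape of the feature matrix and the class labels. This is a minimal sketch of the inspection step, not part of the tuning itself:

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

# The iris dataset has 150 samples with 4 features each
print(X.shape)        # (150, 4)
# and 3 target classes
print(np.unique(y))   # [0 1 2]
```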

Define the estimator and parameter grid

Now we need to define the estimator that we want to tune and the parameter grid that we want to search. The parameter grid specifies the values that we want to try for each hyperparameter.

from sklearn.svm import SVC

# Create an instance of the support vector classifier
svc = SVC()

# Define the parameter grid
param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': [0.1, 0.01, 0.001],
              'kernel': ['linear', 'rbf']}
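Because grid search tries every combination, it helps to know how large the grid is before fitting. A quick way to count the combinations in the grid above (4 values of C × 3 values of gamma × 2 kernels = 24):

```python
from itertools import product

param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': [0.1, 0.01, 0.001],
              'kernel': ['linear', 'rbf']}

# Grid search will evaluate the Cartesian product of all value lists
n_combinations = len(list(product(*param_grid.values())))
print(n_combinations)  # 24
```

With 5-fold cross-validation, grid search will therefore fit the estimator 24 × 5 = 120 times, which is why the grid size matters for runtime.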

Perform grid search with GridSearchCV

Grid search exhaustively searches through all possible combinations of hyperparameters in the specified parameter grid, evaluating the performance of each combination using cross-validation.

# Create an instance of GridSearchCV
grid_search = GridSearchCV(svc, param_grid, cv=5)

# Fit the data to perform grid search
grid_search.fit(X, y)

# Print the best combination of hyperparameters
print('Best hyperparameters:', grid_search.best_params_)
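Beyond best_params_, the fitted search object also exposes the best cross-validated score, the per-combination results, and a refit copy of the best estimator. A minimal sketch (using a smaller grid than above to keep the run fast):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}

grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(iris.data, iris.target)

# Mean cross-validated score of the best combination
print(grid_search.best_score_)
# Mean test score for every combination tried
print(grid_search.cv_results_['mean_test_score'])
# The best estimator, refit on the full dataset, ready for predictions
print(grid_search.best_estimator_)
```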

Perform randomized search with RandomizedSearchCV

Randomized search randomly samples a fixed number of combinations from the parameter grid and evaluates each one using cross-validation. It is useful when the parameter space is large and an exhaustive search is not feasible.

# Create an instance of RandomizedSearchCV
random_search = RandomizedSearchCV(svc, param_grid, cv=5, n_iter=10, random_state=0)

# Fit the data to perform randomized search
random_search.fit(X, y)

# Print the best combination of hyperparameters
print('Best hyperparameters:', random_search.best_params_)
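Randomized search can also draw hyperparameter values from continuous distributions instead of fixed lists, which makes better use of a limited sampling budget. A sketch using scipy.stats.loguniform (this assumes SciPy is installed, which scikit-learn already requires):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

iris = load_iris()

# Sample C and gamma from continuous log-uniform distributions
# rather than a small fixed list of candidate values
param_distributions = {
    'C': loguniform(1e-2, 1e2),
    'gamma': loguniform(1e-4, 1e0),
    'kernel': ['linear', 'rbf'],
}

random_search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=10, cv=5, random_state=0
)
random_search.fit(iris.data, iris.target)
print('Best hyperparameters:', random_search.best_params_)
```

Log-uniform distributions are a common choice for parameters like C and gamma that are naturally tuned on a logarithmic scale.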

Summary

In this lab, we learned how to tune the hyperparameters of an estimator using GridSearchCV and RandomizedSearchCV. We defined the estimator and the parameter grid, then performed grid search and randomized search to find the best combination of hyperparameters. Hyperparameter tuning is an important step toward building better-performing machine learning models.
