Diabetes Prediction Using Voting Regressor


This tutorial is from the open-source community.

Introduction

In this lab, we will use a Voting Regressor to predict the progression of diabetes in patients. We will fit three different regressors on the data: Gradient Boosting Regressor, Random Forest Regressor, and Linear Regression. These three regressors will then be combined in a Voting Regressor, which averages their individual predictions. Finally, we will plot the predictions made by all models for comparison.

We will work with the diabetes dataset, which consists of 10 features collected from a cohort of diabetes patients. The target is a quantitative measure of disease progression one year after baseline.




Import Libraries

Let's import the necessary libraries to perform the diabetes prediction using the Voting Regressor.

import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import VotingRegressor

Load the Diabetes Dataset

Next, we load the diabetes dataset with the load_diabetes() function provided by scikit-learn. Called with return_X_y=True, the function returns a tuple of two arrays: the feature data and the target data. We assign these arrays to X and y, respectively.

# Load the diabetes dataset
X, y = load_diabetes(return_X_y=True)
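To get a feel for what was just loaded, a quick inspection along the following lines may help. It is only a sketch: the variable name `diabetes` is introduced here for illustration and is not used elsewhere in the lab.

```python
from sklearn.datasets import load_diabetes

# Same call as in the lab: returns the features and the target as two arrays
X, y = load_diabetes(return_X_y=True)

print(X.shape)  # one row per patient, one column per feature
print(y.shape)  # one disease-progression value per patient

# Without return_X_y=True, the function returns a Bunch object that also
# carries metadata such as the feature names
diabetes = load_diabetes()
print(diabetes.feature_names)
```

The first dimension of X and y should agree, and X should have 10 columns, matching the 10 features described in the introduction.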

Train the Regressors

Now, let's instantiate a Gradient Boosting Regressor, a Random Forest Regressor, and a Linear Regression, and fit each one on the data. Next, we combine the three regressors in a Voting Regressor and fit it as well.

# Train the regressors
reg1 = GradientBoostingRegressor(random_state=1)
reg2 = RandomForestRegressor(random_state=1)
reg3 = LinearRegression()

reg1.fit(X, y)
reg2.fit(X, y)
reg3.fit(X, y)

ereg = VotingRegressor([("gb", reg1), ("rf", reg2), ("lr", reg3)])
ereg.fit(X, y)
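One detail worth knowing is that VotingRegressor fits clones of the estimators it is given rather than the objects themselves, so the separate fit calls above are needed only because the lab later predicts with each regressor individually. The sketch below illustrates this under some assumptions: it uses only two of the three regressors and a smaller forest (n_estimators=10) to keep it fast, and the variable names `is_clone`, `same_coefficients`, and `rf_still_unfitted` are made up for the illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# One regressor is deliberately left unfitted, the other is fitted separately
reg2 = RandomForestRegressor(n_estimators=10, random_state=1)
reg3 = LinearRegression().fit(X, y)

ereg = VotingRegressor([("rf", reg2), ("lr", reg3)])
ereg.fit(X, y)

# The fitted clones are available by name through named_estimators_
fitted_lr = ereg.named_estimators_["lr"]

is_clone = fitted_lr is not reg3                               # a different object
same_coefficients = np.allclose(fitted_lr.coef_, reg3.coef_)   # same fit on same data
rf_still_unfitted = not hasattr(reg2, "estimators_")           # original left untouched

print(is_clone, same_coefficients, rf_still_unfitted)
```

Because the original reg2 is never fitted here, calling reg2.predict would fail, while ereg.predict works as expected.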

Making Predictions

Now we will use each of the regressors to predict the first 20 samples.

# Make predictions
xt = X[:20]

pred1 = reg1.predict(xt)
pred2 = reg2.predict(xt)
pred3 = reg3.predict(xt)
pred4 = ereg.predict(xt)
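The Voting Regressor's output can be checked against a manual average of its members. The following sketch rebuilds the ensemble from scratch so that it runs on its own; the names `individual`, `manual_average`, and `matches` are introduced for illustration, and equal weighting is assumed because no weights are passed to VotingRegressor.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

ereg = VotingRegressor(
    [
        ("gb", GradientBoostingRegressor(random_state=1)),
        ("rf", RandomForestRegressor(random_state=1)),
        ("lr", LinearRegression()),
    ]
)
ereg.fit(X, y)

xt = X[:20]

# Predictions of each fitted member, stacked into shape (3, 20)
individual = np.array([est.predict(xt) for est in ereg.estimators_])

# Unweighted mean across the three members, one value per sample
manual_average = individual.mean(axis=0)

# Compare with what the Voting Regressor itself returns
matches = np.allclose(manual_average, ereg.predict(xt))
print(matches)
```

If the comparison holds, each ensemble prediction is the plain mean of the three individual predictions for that sample.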

Plot the Results

Finally, we will visualize the 20 predictions. The red stars show the averaged predictions made by the Voting Regressor.

# Plot the results
plt.figure()
plt.plot(pred1, "gd", label="GradientBoostingRegressor")
plt.plot(pred2, "b^", label="RandomForestRegressor")
plt.plot(pred3, "ys", label="LinearRegression")
plt.plot(pred4, "r*", ms=10, label="VotingRegressor")

plt.tick_params(axis="x", which="both", bottom=False, top=False, labelbottom=False)
plt.ylabel("predicted")
plt.xlabel("training samples")
plt.legend(loc="best")
plt.title("Regressor predictions and their average")

plt.show()

Summary

In this lab, we have used the Voting Regressor to predict the progression of diabetes in patients. We have used three different regressors to predict the data: Gradient Boosting Regressor, Random Forest Regressor, and Linear Regression. We have also visualized the predictions made by all models for comparison.
