Introduction
In this lab, we will explore linear models in scikit-learn. Linear models are a family of methods for regression and classification that assume the target value can be expressed as a linear combination of the input features. They are widely used in machine learning because of their simplicity and interpretability.
We will cover the following topics:
- Ordinary Least Squares
- Ridge Regression
- Lasso
- Logistic Regression
- Stochastic Gradient Descent
- Perceptron
If you don't have any prior experience with Machine Learning, start with Supervised Learning: Regression.
VM Tips
After the VM startup is done, click the top left corner to switch to the Notebook tab to access Jupyter Notebook for practice.
Sometimes, you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook.
If you face issues during learning, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.
Ordinary Least Squares
Ordinary Least Squares (OLS) is a linear regression method that minimizes the sum of squared differences between the observed targets and the predicted targets. Mathematically, it solves a problem of the form: $$\min_{w} \|Xw - y\|_2^2$$
Let's start by fitting a linear regression model using OLS.
from sklearn import linear_model

# Create an ordinary least squares regressor
reg = linear_model.LinearRegression()

# Toy training data: the target is an exact linear function of the two features
X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]

# Fit the model and print the learned coefficients
reg.fit(X, y)
print(reg.coef_)
- We import the `linear_model` module from scikit-learn.
- We create an instance of `LinearRegression`.
- We use the `fit` method to fit the model to the training data.
- We print the coefficients of the linear model.
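Once the model is fitted, it can also be used to make predictions. A minimal sketch (the query point `[3, 3]` is just an illustrative value):

# The fitted model stores the intercept alongside the coefficients
print(reg.intercept_)

# Predict the target for a new, unseen point
print(reg.predict([[3, 3]]))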
Ridge Regression
Ridge regression is a linear regression method that adds an L2 penalty term to the ordinary least squares objective. This penalty helps reduce overfitting by shrinking the coefficients towards zero, and the model complexity is controlled by the regularization parameter.
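Mathematically, ridge regression solves a problem of the form: $$\min_{w} \|Xw - y\|_2^2 + \alpha \|w\|_2^2$$ where $\alpha \geq 0$: the larger the value of $\alpha$, the stronger the shrinkage of the coefficients.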
Let's fit a ridge regression model.
# Ridge regression with regularization strength alpha=0.5
reg = linear_model.Ridge(alpha=0.5)
reg.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])
print(reg.coef_)
- We create an instance of `Ridge` with the regularization parameter `alpha` set to 0.5.
- We use the `fit` method to fit the model to the training data.
- We print the coefficients of the ridge regression model.
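In practice, a good value of `alpha` is rarely known in advance. One common approach is built-in cross-validation with `RidgeCV`; the sketch below is illustrative, and the `alphas` grid is an arbitrary choice:

import numpy as np

# RidgeCV picks the best alpha from a grid using cross-validation
reg_cv = linear_model.RidgeCV(alphas=np.logspace(-3, 3, 7))
reg_cv.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])
print(reg_cv.alpha_)  # the alpha selected by cross-validation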
Lasso
Lasso is a linear regression method that adds an L1 penalty term to the ordinary least squares objective. The L1 penalty has the effect of setting some coefficients exactly to zero, so lasso performs a form of feature selection and can be used to estimate sparse models.
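Mathematically, the lasso solves a problem of the form: $$\min_{w} \frac{1}{2 n_{\text{samples}}} \|Xw - y\|_2^2 + \alpha \|w\|_1$$ where the L1 norm $\|w\|_1$ is what pushes some coefficients to exactly zero.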
Let's fit a lasso model.
# Lasso with regularization strength alpha=0.1
reg = linear_model.Lasso(alpha=0.1)
reg.fit([[0, 0], [1, 1]], [0, 1])
print(reg.coef_)
- We create an instance of `Lasso` with the regularization parameter `alpha` set to 0.1.
- We use the `fit` method to fit the model to the training data.
- We print the coefficients of the lasso model.
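To see the feature selection effect, consider a sketch on synthetic data in which the second feature is pure noise; the data and seed below are only illustrative:

import numpy as np

# Two random features, but the target depends only on the first one
rng = np.random.RandomState(0)
X_sparse = rng.randn(50, 2)
y_sparse = 3 * X_sparse[:, 0]

reg = linear_model.Lasso(alpha=0.1)
reg.fit(X_sparse, y_sparse)
print(reg.coef_)  # the coefficient of the noise feature is typically exactly 0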
Logistic Regression
Logistic regression is a classification method that estimates the probabilities of the possible outcomes using a logistic function. It is commonly used for binary classification tasks. Logistic regression can also be extended to handle multi-class classification problems.
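For binary classification, the predicted probability of the positive class is obtained by passing the linear combination of the features through the logistic (sigmoid) function: $$p(y=1 \mid x) = \frac{1}{1 + \exp(-(w^\top x + b))}$$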
Let's fit a logistic regression model.
# Reuse X and y from the OLS step; y is now treated as class labels
clf = linear_model.LogisticRegression(random_state=0).fit(X, y)
print(clf.coef_)
- We create an instance of `LogisticRegression` with the `random_state` parameter set to 0.
- We use the `fit` method to fit the model to the training data.
- We print the coefficients of the logistic regression model.
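Because logistic regression is probabilistic, the fitted classifier can report class probabilities as well as labels. A minimal sketch (the query point `[1.5, 1.5]` is just an illustrative value):

# Predicted class label for a new point
print(clf.predict([[1.5, 1.5]]))

# Estimated probability of each class (one column per class)
print(clf.predict_proba([[1.5, 1.5]]))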
Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent (SGD) is a simple yet efficient approach for training linear models. It is particularly useful when the number of samples and features is very large. SGD updates the model parameters incrementally, using one training sample (or one small batch) at a time, which makes it suitable for online learning and out-of-core learning.
Let's fit a logistic regression model using SGD.
# loss="log_loss" makes SGDClassifier fit a logistic regression model
clf = linear_model.SGDClassifier(loss="log_loss", max_iter=1000)
clf.fit(X, y)
print(clf.coef_)
- We create an instance of `SGDClassifier` with the `loss` parameter set to "log_loss" to perform logistic regression.
- We use the `fit` method to fit the model to the training data.
- We print the coefficients of the logistic regression model obtained using SGD.
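The online-learning side of SGD is exposed through `partial_fit`, which updates the model one batch at a time. A minimal sketch that splits the same toy data into two batches (the split itself is only illustrative):

clf = linear_model.SGDClassifier(loss="log_loss")

# All classes must be declared on the first call to partial_fit
clf.partial_fit(X[:2], y[:2], classes=[0, 1, 2])

# Later batches update the model incrementally
clf.partial_fit(X[2:], y[2:])
print(clf.coef_)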
Perceptron
The Perceptron is a simple linear classification algorithm suitable for large-scale learning. It updates its model only on mistakes, which makes it slightly faster to train than SGD with the hinge loss, and the resulting models are sparser.
Let's fit a perceptron model.
# alpha is only used as a regularization strength when penalty is set;
# with the default penalty=None it has no effect on training
clf = linear_model.Perceptron(alpha=0.1)
clf.fit(X, y)
print(clf.coef_)
- We create an instance of `Perceptron` with `alpha` set to 0.1. Note that this regularization strength only takes effect when the `penalty` parameter is set; with the default `penalty=None` it is ignored.
- We use the `fit` method to fit the model to the training data.
- We print the coefficients of the perceptron model.
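In scikit-learn, `Perceptron` shares its underlying implementation with `SGDClassifier`: the documentation notes that `Perceptron()` is equivalent to `SGDClassifier(loss="perceptron", eta0=1, learning_rate="constant", penalty=None)`. A sketch of that equivalent formulation:

# Equivalent way to train a perceptron via SGDClassifier
clf_sgd = linear_model.SGDClassifier(
    loss="perceptron", eta0=1, learning_rate="constant", penalty=None
)
clf_sgd.fit(X, y)
print(clf_sgd.coef_)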
Summary
In this lab, we explored linear models in scikit-learn. We covered ordinary least squares, ridge regression, lasso, logistic regression, stochastic gradient descent, and the perceptron. Together, these models handle both regression and classification tasks, and we saw how ideas such as regularization, sparsity-based feature selection, and online learning fit into the same linear-model framework.