Machine Learning Classifiers | LDA and QDA Tutorials

Introduction

Linear and Quadratic Discriminant Analysis (LDA and QDA) are two classic classifiers used in machine learning. LDA uses a linear decision surface, while QDA uses a quadratic decision surface. These classifiers are popular because they have closed-form solutions, work well in practice, and have no hyperparameters to tune.

In this lab, we will explore how to perform LDA and QDA using scikit-learn, a popular machine learning library in Python.

VM Tips

After the VM startup is done, click the top left corner to switch to the Notebook tab to access Jupyter Notebook for practice.

Sometimes, you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook.

If you face issues during learning, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.

This is a Guided Lab, which provides step-by-step instructions to help you learn and practice. Follow the instructions carefully to complete each step and gain hands-on experience. Historical data shows that this is a beginner level lab with a 81% completion rate. It has received a 95% positive review rate from learners.

Import the required libraries

First, we need to import the required libraries, including scikit-learn (sklearn) and matplotlib, which will be used for data visualization.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis

Generate synthetic data

Next, we will generate synthetic data to demonstrate the difference between LDA and QDA. We will use the make_classification function from scikit-learn to create two classes with distinct patterns.

from sklearn.datasets import make_classification

## Generate synthetic data
X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, n_classes=2, random_state=1)

Train and visualize the classifiers

Now, we will train the LDA and QDA classifiers on the synthetic data and visualize the decision boundaries.

## Train the LDA classifier
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)

## Train the QDA classifier
qda = QuadraticDiscriminantAnalysis()
qda.fit(X, y)

## Plot the decision boundaries
def plot_decision_boundary(classifier, title):
    h = 0.02  ## step size in the mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    Z = classifier.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.8)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', cmap=plt.cm.Paired)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.title(title)

plt.figure(figsize=(10, 4))

plt.subplot(1, 2, 1)
plot_decision_boundary(lda, 'Linear Discriminant Analysis')

plt.subplot(1, 2, 2)
plot_decision_boundary(qda, 'Quadratic Discriminant Analysis')

plt.tight_layout()
plt.show()

Perform dimensionality reduction using LDA

LDA can also be used for supervised dimensionality reduction. We will demonstrate this by reducing the dimension of the Iris dataset.

from sklearn.datasets import load_iris

## Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

## Perform dimensionality reduction using LDA
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)

Summary

Linear and Quadratic Discriminant Analysis (LDA and QDA) are two classic classifiers used in machine learning. LDA uses a linear decision surface, while QDA uses a quadratic decision surface. These classifiers have closed-form solutions and perform well in practice. LDA can also be used for supervised dimensionality reduction.

Discriminant Analysis Classifiers Explained