Plot PCA vs LDA



Introduction

In this lab, we will compare the performance of two popular dimensionality reduction algorithms, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), on the Iris dataset. The Iris dataset contains 3 types of Iris flowers with 4 attributes: sepal length, sepal width, petal length, and petal width.

VM Tips

After the VM has started, click the Notebook tab in the top left corner to open Jupyter Notebook for practice.

Sometimes, you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook.

If you run into issues during the lab, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.



Load the Dataset

First, we need to load the Iris dataset using scikit-learn's built-in load_iris() function.

import matplotlib.pyplot as plt
from sklearn import datasets

# Load the Iris dataset bundled with scikit-learn
iris = datasets.load_iris()

X = iris.data                      # feature matrix: 150 samples x 4 attributes
y = iris.target                    # integer class labels (0, 1, 2)
target_names = iris.target_names   # class names used for the plot legends
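
As a quick, optional sanity check, you can confirm the shapes of the arrays we just loaded (the comments below show the values expected for the standard Iris dataset):

print(X.shape)        # (150, 4): 150 samples, 4 attributes
print(y.shape)        # (150,): one integer label per sample
print(target_names)   # ['setosa' 'versicolor' 'virginica']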

Perform PCA

Next, we will perform Principal Component Analysis (PCA) on the dataset to find the combinations of attributes (principal components) that account for the most variance in the data. We will then plot the samples on the first two principal components.

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_r = pca.fit(X).transform(X)

# Percentage of variance explained for each component
print("Explained variance ratio (first two components): %s" % str(pca.explained_variance_ratio_))

plt.figure()
colors = ["navy", "turquoise", "darkorange"]
lw = 2

for color, i, target_name in zip(colors, [0, 1, 2], target_names):
    plt.scatter(X_r[y == i, 0], X_r[y == i, 1], color=color, alpha=0.8, lw=lw, label=target_name)

plt.legend(loc="best", shadow=False, scatterpoints=1)
plt.title("PCA of Iris Dataset")
plt.show()
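
To see how much of the total variance the two plotted components retain, here is a minimal sketch that reuses the same PCA class but keeps all four components (an optional check, not part of the original example):

import numpy as np
from sklearn.decomposition import PCA

# Fit PCA without limiting the number of components (all four are kept)
pca_full = PCA().fit(X)

# Cumulative share of variance explained; the first two entries correspond
# to the 2D projection plotted above
print(np.cumsum(pca_full.explained_variance_ratio_))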

Perform LDA

Now, we will perform Linear Discriminant Analysis (LDA) on the dataset to find the linear combinations of attributes that best separate the classes. Unlike PCA, LDA is a supervised method that uses the known class labels.

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis(n_components=2)
X_r2 = lda.fit(X, y).transform(X)

plt.figure()
for color, i, target_name in zip(colors, [0, 1, 2], target_names):
    plt.scatter(X_r2[y == i, 0], X_r2[y == i, 1], alpha=0.8, color=color, label=target_name)

plt.legend(loc="best", shadow=False, scatterpoints=1)
plt.title("LDA of Iris Dataset")
plt.show()
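
Because LDA uses the class labels, the same estimator can also be used directly as a classifier. The following is a minimal sketch (an optional extra, assuming X and y from the earlier steps) that reports its cross-validated accuracy on the full 4-attribute data:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# 5-fold cross-validated accuracy of LDA used as a classifier
scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5)
print("Mean accuracy: %.3f" % scores.mean())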

Compare Results

Finally, we compare the two projections. PCA finds the directions of greatest overall variance without using the class labels, whereas LDA explicitly maximizes the separation between classes. As a result, the LDA projection separates the three Iris classes more cleanly than the PCA projection.
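
One way to back up this visual impression with a number is to train a simple classifier on each 2D projection and compare cross-validated accuracies. The sketch below uses a k-nearest-neighbors classifier and the projections X_r and X_r2 computed above; note that both projections were fitted on the full dataset (LDA even used the labels), so this is only an illustrative check, not a rigorous evaluation:

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Cross-validated accuracy of a k-NN classifier on each 2D projection
for name, X_proj in [("PCA", X_r), ("LDA", X_r2)]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X_proj, y, cv=5)
    print("%s projection: mean accuracy %.3f" % (name, scores.mean()))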

Summary

In this lab, we learned how to perform Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) on the Iris dataset using scikit-learn. We also compared the two dimensionality reduction techniques and saw that LDA, which makes use of the class labels, separates the classes more clearly than PCA.
