Introduction
Matplotlib is a popular data visualization library in Python. One of the most common ways to visualize data distributions is by using histograms. In this lab, we will learn how to create histograms with Matplotlib and explore different customization options.
Import the necessary libraries
First, we need to import the necessary libraries, including Matplotlib and NumPy.
import matplotlib.pyplot as plt
import numpy as np
Generate sample data
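Next, we will generate some sample data to use for the histogram. In this example, we draw three datasets of 1,000 values each from a standard normal distribution and store them as the columns of a single 1000 x 3 array. Fixing the random seed makes the results reproducible.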
np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)
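Before plotting, it can help to confirm what the array looks like. The snippet below is a small sanity check, not part of the original lab: it prints the shape and the per-column mean and standard deviation, which should be close to 0 and 1 for standard normal data.

```python
import numpy as np

np.random.seed(19680801)
x = np.random.randn(1000, 3)

print(x.shape)         # (1000, 3): 1000 rows, one column per dataset
print(x.mean(axis=0))  # per-column means, each close to 0
print(x.std(axis=0))   # per-column standard deviations, each close to 1
```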
Plot a basic histogram
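We can create a basic histogram using the hist function in Matplotlib. This function takes the data we want to plot and the number of bins we want to use. Because x is a two-dimensional array, hist treats each column as a separate dataset and draws the three sets of bars side by side within each bin.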
plt.hist(x, n_bins)
plt.show()
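The hist function also returns the values it plotted, which is useful when the bin counts are needed for further analysis. The sketch below captures those return values; the variable names counts, bin_edges, and patches are our own choice for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)

# hist returns the bin counts, the bin edges, and the drawn artists
counts, bin_edges, patches = plt.hist(x, n_bins)
counts = np.asarray(counts)

print(counts.shape)        # one row of counts per dataset
print(bin_edges.shape)     # n_bins + 1 edges, shared by all datasets
print(counts.sum(axis=1))  # every sample falls into some bin
plt.show()
```

The bin edges are computed once from all of the values, so the three datasets share the same bins and their counts can be compared directly.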
Add labels and a title
We can add labels to the x and y axes and a title to the plot using the xlabel, ylabel, and title functions.
plt.hist(x, n_bins)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram of Random Data')
plt.show()
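The same plot can be built with Matplotlib's object-oriented interface, where the labels are set on an Axes object rather than through pyplot functions. This is an alternative style, not something the lab requires; it tends to be easier to manage once a figure contains more than one plot.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)

# Create the figure and axes explicitly instead of relying on pyplot state
fig, ax = plt.subplots()
ax.hist(x, n_bins)
ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
ax.set_title('Histogram of Random Data')
plt.show()
```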
Customize the histogram
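We can customize the histogram by changing the color, transparency, and edge color of the bars using the color, alpha, and edgecolor parameters. Because x contains three datasets, color must be given as a list with one color per dataset; passing a single color here raises a ValueError.
plt.hist(x, n_bins, color=['green', 'blue', 'red'], alpha=0.5, edgecolor='black')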
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram of Random Data')
plt.show()
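Matplotlib expects one color per dataset, so a single color string is only appropriate when a single dataset is plotted. As a minimal sketch, the example below selects the first column of x and styles it with one color.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)

# x[:, 0] is a single dataset, so one color string is enough
counts, bin_edges, patches = plt.hist(
    x[:, 0], n_bins, color='green', alpha=0.5, edgecolor='black'
)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram of the First Sample')
plt.show()
```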
Plot multiple histograms
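We can plot multiple histograms on the same plot by passing a two-dimensional array to the hist function, where each column is one dataset. To tell the datasets apart, we give each one its own color and label and then add a legend.
plt.hist(x, n_bins, color=['green', 'blue', 'red'], alpha=0.5, edgecolor='black', label=['Sample 1', 'Sample 2', 'Sample 3'])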
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram of Random Data')
plt.legend()
plt.show()
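An alternative to side-by-side bars is to overlay the histograms by calling hist once per dataset. The sketch below assumes we want all three to share the same bins, so it computes the edges up front with np.histogram_bin_edges; the loop and variable names are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)

# Shared bin edges computed from all values so the histograms line up
bin_edges = np.histogram_bin_edges(x, bins=n_bins)

colors = ['green', 'blue', 'red']
labels = ['Sample 1', 'Sample 2', 'Sample 3']

fig, ax = plt.subplots()
all_counts = []
for column, color, label in zip(x.T, colors, labels):
    # Semi-transparent bars keep the overlapping regions visible
    counts, _, _ = ax.hist(column, bin_edges, color=color, alpha=0.5,
                           edgecolor='black', label=label)
    all_counts.append(counts)

ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
ax.set_title('Overlaid Histograms of Random Data')
ax.legend()
plt.show()
```

Overlaying works best with a small number of datasets; with many, the overlapping colors become hard to read.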
Plot stacked histograms
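We can plot stacked histograms by setting the stacked parameter to True. The bars for the three datasets are then drawn on top of one another in each bin rather than side by side, so the total bar height shows the combined count.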
plt.hist(x, n_bins, color=['green', 'blue', 'red'], alpha=0.5, edgecolor='black', label=['Sample 1', 'Sample 2', 'Sample 3'], stacked=True)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Stacked Histogram of Random Data')
plt.legend()
plt.show()
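With stacked=True, the counts that hist returns are cumulative across datasets, so the last row gives the total height of each stacked bar. The sketch below checks this against a histogram of all values combined; the variable names are our own.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)

counts, bin_edges, _ = plt.hist(
    x, n_bins, color=['green', 'blue', 'red'], alpha=0.5,
    edgecolor='black', stacked=True,
)
counts = np.asarray(counts)

# Histogram of all 3000 values using the same bin edges
total, _ = np.histogram(x.ravel(), bins=bin_edges)

print(counts[-1])  # top of each stack
print(total)       # combined count per bin; matches the line above
plt.show()
```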
Plot step histograms
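We can plot step histograms by setting the histtype parameter to 'step'. Each histogram is then drawn as an unfilled outline instead of filled bars, which makes overlapping distributions easier to compare.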
plt.hist(x, n_bins, histtype='step', color=['green', 'blue', 'red'], label=['Sample 1', 'Sample 2', 'Sample 3'])
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Step Histogram of Random Data')
plt.legend()
plt.show()
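A closely related option is histtype='stepfilled', which draws the same outline but fills the area under it. As a rough comparison, the sketch below places both styles side by side; the subplot layout and figure size are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
n_bins = 10
x = np.random.randn(1000, 3)

colors = ['green', 'blue', 'red']
labels = ['Sample 1', 'Sample 2', 'Sample 3']

fig, (ax_step, ax_filled) = plt.subplots(1, 2, sharey=True, figsize=(10, 4))

# Outline only
step_counts, _, _ = ax_step.hist(x, n_bins, histtype='step',
                                 color=colors, label=labels)
ax_step.set_title("histtype='step'")
ax_step.set_xlabel('Value')
ax_step.set_ylabel('Frequency')
ax_step.legend()

# Same outline with a semi-transparent fill
filled_counts, _, _ = ax_filled.hist(x, n_bins, histtype='stepfilled',
                                     color=colors, alpha=0.5, label=labels)
ax_filled.set_title("histtype='stepfilled'")
ax_filled.set_xlabel('Value')
ax_filled.legend()

plt.show()
```

Only the drawing style differs between the two panels; the underlying bin counts are identical.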
Summary
In this lab, we learned how to create histograms using Matplotlib. We explored different customization options, including changing the color, transparency, and edge color of the bars, plotting multiple histograms on the same plot, stacking histograms, and plotting step histograms. These tools can help us to better understand the distribution of our data.