Introduction
In this project, you will build a simple handwritten digit classifier using the digits dataset bundled with the scikit-learn library. Handwritten character recognition is a classic problem in machine learning, and this project walks you through creating a classifier that predicts which digit (0-9) a handwritten character image represents.
🎯 Tasks
In this project, you will learn:
- How to load the DIGITS dataset and split it into training and testing sets
- How to create and train a Support Vector Machine (SVM) classifier on the training data
- How to implement a function to classify a single handwritten character image
- How to test the classifier with a sample handwritten character image
🏆 Achievements
After completing this project, you will be able to:
- Load and preprocess a dataset for machine learning tasks
- Create and train an SVM classifier using scikit-learn
- Implement a prediction function to classify new samples
- Understand the basics of handwritten character recognition using machine learning techniques
Load the Digits Dataset
In this step, you will learn how to load the DIGITS dataset from the scikit-learn library. Follow the steps below to complete this step:
Open the handwritten_digit_classifier.py file and import the necessary libraries:
from sklearn import datasets
from sklearn.model_selection import train_test_split
Load the DIGITS dataset using the datasets.load_digits() function:
digits = datasets.load_digits()
X, y = digits.data, digits.target
The X variable contains the images, each flattened from 8x8 pixels into a row of 64 values, and the y variable contains the corresponding digit labels (0-9).
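As a quick sanity check, the shapes and value ranges can be inspected directly. The following is a minimal sketch rather than part of the project file; the name first_image is illustrative only.

```python
from sklearn import datasets

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Each row of X is one 8x8 image flattened into 64 pixel intensities.
print("X shape:", X.shape)
print("y shape:", y.shape)

# digits.images holds the same pixels in their original 8x8 layout.
first_image = digits.images[0]
print("image shape:", first_image.shape)

# Pixel intensity range and the distinct labels present in the dataset.
print("pixel range:", X.min(), "to", X.max())
print("labels:", sorted(set(y.tolist())))
```

Comparing X[0] with first_image.ravel() shows that the flattened row and the 8x8 image contain the same values in the same order.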
Split the dataset into training and testing sets using train_test_split():
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
This splits the data into 80% training and 20% testing sets. Fixing random_state=42 makes the split reproducible, so every run produces the same partition.
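To confirm the proportions and the effect of random_state, the split can be checked as in the sketch below. The names X_test_again and test_fraction are illustrative and not part of the project file.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X, y = digits.data, digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The two parts together cover the whole dataset.
test_fraction = len(X_test) / len(X)
print("train:", len(X_train), "test:", len(X_test))
print("test fraction:", round(test_fraction, 3))

# Splitting again with the same seed yields the identical test set.
_, X_test_again, _, _ = train_test_split(X, y, test_size=0.2, random_state=42)
print("same split:", (X_test == X_test_again).all())
```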
Create and Train the SVM Classifier
In this step, you will learn how to create and train a Support Vector Machine (SVM) classifier on the training data. Follow the steps below to complete this step:
Import the SVC class from the sklearn.svm module in the handwritten_digit_classifier.py file:
from sklearn.svm import SVC
Create an SVM classifier with a linear kernel and a regularization parameter of 1:
clf = SVC(kernel="linear", C=1)
Train the SVM classifier on the training data using the fit() method:
clf.fit(X_train, y_train)
After fit() returns, clf holds the trained model and can be used to make predictions.
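The held-out test set created earlier can be used to estimate how well the trained classifier generalizes. The sketch below is an optional check and not a step the project requires; it assumes accuracy_score from sklearn.metrics as the metric, and y_pred and accuracy are illustrative names.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# Same classifier configuration as in the step above.
clf = SVC(kernel="linear", C=1)
clf.fit(X_train, y_train)

# Predict labels for images the model never saw during training.
y_pred = clf.predict(X_test)

# Fraction of test images whose predicted label matches the true label.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```

Calling clf.score(X_test, y_test) should report the same figure, since the default score for a classifier is mean accuracy.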
Implement the Prediction Function
In this step, you will implement the predict(sample) function to classify a single handwritten character image. Follow the steps below to complete this step:
Import the numpy module in the handwritten_digit_classifier.py file:
import numpy as np
Define the predict(sample) function:
def predict(sample):
    """
    Parameters:
    sample -- A list of pixel values of a handwritten character image
    Returns:
    pred -- The predicted label for the handwritten character image as an integer
    """
    # Reshape the input sample into a single row (1 sample, n features)
    sample = np.array(sample).reshape(1, -1)
    # Use the trained classifier to make a prediction
    pred = clf.predict(sample)
    return int(pred[0])
In the predict(sample) function:
- Convert the input sample list into a NumPy array and reshape it to have a single sample with the same format as the training data.
- Use the trained clf classifier to predict the label for the reshaped input sample using the predict() method.
- Return the predicted label as an integer.
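The reshape step matters because scikit-learn estimators expect a 2-D array of shape (n_samples, n_features), while a plain list of pixels becomes a 1-D array. The sketch below illustrates the difference. The helper to_feature_row and the constant N_PIXELS are hypothetical additions for illustration, and the length check is an assumption about what input validation might look like, not something the project requires.

```python
import numpy as np

N_PIXELS = 64  # 8x8 image, matching the width of the training data


def to_feature_row(sample):
    """Return the sample as a (1, 64) array, or raise if the size is wrong."""
    arr = np.asarray(sample, dtype=float)
    if arr.size != N_PIXELS:
        raise ValueError(f"expected {N_PIXELS} pixel values, got {arr.size}")
    # -1 lets NumPy infer the number of columns from the data.
    return arr.reshape(1, -1)


flat = np.array([0.0] * N_PIXELS)
print(flat.shape)  # 1-D: a single axis of pixel values

row = to_feature_row(flat)
print(row.shape)   # 2-D: one sample with one column per pixel
```

Because the check uses the total element count, an 8x8 nested list would also be accepted and flattened into a single row.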
Test the Classifier
You can now test the predict(sample) function with a sample handwritten character image. Here's an example in the handwritten_digit_classifier.py file:
sample = [
    0.0, 0.0, 6.0, 14.0, 4.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 11.0, 16.0, 10.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 8.0, 14.0, 16.0, 2.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 12.0, 12.0, 11.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 11.0, 3.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 5.0, 11.0, 0.0, 0.0,
    0.0, 1.0, 4.0, 4.0, 7.0, 16.0, 2.0, 0.0,
    0.0, 7.0, 16.0, 16.0, 13.0, 11.0, 1.0, 0.0
]
result = predict(sample)
print("Predicted Label:", result)
This should output the predicted label for the given handwritten character image.
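Since the sample is just a list of numbers, it can help to see it as an image first. The sketch below prints the 64 values as an 8x8 grid of text characters. The render function and the RAMP character ramp are hypothetical helpers chosen for illustration; the pixel values are copied from the sample above.

```python
sample = [
    0.0, 0.0, 6.0, 14.0, 4.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 11.0, 16.0, 10.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 8.0, 14.0, 16.0, 2.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 12.0, 12.0, 11.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 11.0, 3.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 5.0, 11.0, 0.0, 0.0,
    0.0, 1.0, 4.0, 4.0, 7.0, 16.0, 2.0, 0.0,
    0.0, 7.0, 16.0, 16.0, 13.0, 11.0, 1.0, 0.0
]

# Illustrative ramp: higher pixel intensity maps to a denser character.
RAMP = " .:*#"


def render(pixels, width=8):
    """Turn a flat list of 0-16 intensities into rows of text characters."""
    rows = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        # Scale 0..16 onto the indices 0..len(RAMP)-1.
        chars = [RAMP[min(int(v) * len(RAMP) // 17, len(RAMP) - 1)] for v in row]
        rows.append("".join(chars))
    return rows


lines = render(sample)
print("\n".join(lines))
```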
Run the handwritten_digit_classifier.py file to execute the example:
python handwritten_digit_classifier.py
## Predicted Label: 9
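A single hand-typed sample gives only one data point. As a further check, predict() can be run on a few held-out test images whose true labels are known. The sketch below repeats the earlier steps so that it runs on its own; the choice of the first 20 test samples and the names matches and guess are arbitrary illustrations.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)
clf = SVC(kernel="linear", C=1)
clf.fit(X_train, y_train)


def predict(sample):
    # Same logic as the project's predict(): one row in, one integer label out.
    sample = np.array(sample).reshape(1, -1)
    return int(clf.predict(sample)[0])


# Compare predictions with the known labels for a handful of test images.
matches = 0
for pixels, label in zip(X_test[:20], y_test[:20]):
    guess = predict(pixels.tolist())
    matches += int(guess == label)
    print(f"true={label} predicted={guess}")

print(f"{matches}/20 correct")
```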
Summary
Congratulations! You have completed this project. You can practice more labs in LabEx to improve your skills.



