Simple Handwritten Character Recognition Classifier

Introduction

In this project, you will learn how to build a simple handwritten character recognition classifier using the DIGITS dataset provided by the scikit-learn library. Handwritten character recognition is a classic problem in machine learning, and this project will guide you through the process of creating a classifier that can accurately predict the digit represented in a handwritten character image.

🎯 Tasks

In this project, you will learn:

  • How to load the DIGITS dataset and split it into training and testing sets
  • How to create and train a Support Vector Machine (SVM) classifier on the training data
  • How to implement a function to classify a single handwritten character image
  • How to test the classifier with a sample handwritten character image

🏆 Achievements

After completing this project, you will be able to:

  • Load and preprocess a dataset for machine learning tasks
  • Create and train an SVM classifier using scikit-learn
  • Implement a prediction function to classify new samples
  • Understand the basics of handwritten character recognition using machine learning techniques

Load the Digits Dataset

In this step, you will learn how to load the DIGITS dataset from the scikit-learn library and split it into training and testing sets.

Open the handwritten_digit_classifier.py file and import the necessary libraries:

from sklearn import datasets
from sklearn.model_selection import train_test_split

Load the DIGITS dataset using the datasets.load_digits() function:

digits = datasets.load_digits()
X, y = digits.data, digits.target

The X variable contains the images, each flattened from 8x8 pixels into a row of 64 values with intensities from 0 to 16, and the y variable contains the corresponding digit labels (0-9).
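To make the layout of the data concrete, the following minimal sketch prints the array shapes and checks that a row of X is the flattened form of the matching 8x8 image. The variable name same is introduced here for illustration only.

```python
from sklearn import datasets

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Each row of X is one image flattened from 8x8 to 64 values.
print(X.shape)              # (n_samples, 64)
print(digits.images.shape)  # (n_samples, 8, 8)

# The distinct labels present in y.
print(sorted(set(y.tolist())))

# A flattened row and its 8x8 image hold the same pixel values.
same = (digits.images[0].reshape(-1) == X[0]).all()
print(same)
```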

Split the dataset into training and testing sets using train_test_split():

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

This splits the data into 80% training and 20% testing sets. Setting random_state=42 fixes the shuffle, so the split is the same on every run.
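A quick way to confirm the split behaves as described is to check the set sizes and repeat the call with the same seed. This is a sketch only; the names test_fraction, X_test_again, and identical are illustrative and not part of the project file.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_digits(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The two parts together cover the whole dataset, roughly 80/20.
test_fraction = len(X_test) / len(X)
print(len(X_train), len(X_test), round(test_fraction, 3))

# Repeating the call with the same random_state gives the same split.
_, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.2, random_state=42
)
identical = (X_test == X_test_again).all()
print(identical)
```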

Create and Train the SVM Classifier

In this step, you will learn how to create and train a Support Vector Machine (SVM) classifier on the training data.

Import the SVC class from the sklearn.svm module in the handwritten_digit_classifier.py file:

from sklearn.svm import SVC

Create an SVM classifier with a linear kernel and the regularization parameter C set to 1:

clf = SVC(kernel="linear", C=1)

Train the SVM classifier on the training data using the fit() method:

clf.fit(X_train, y_train)

After fit() returns, clf holds the trained model and is ready to make predictions.
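The project does not measure how well the classifier performs, but the held-out test set makes that easy to check. The sketch below is an optional addition, not part of the original steps: it uses accuracy_score from sklearn.metrics to compare the predictions on X_test with the true labels.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same classifier settings as in the project.
clf = SVC(kernel="linear", C=1)
clf.fit(X_train, y_train)

# Fraction of test images whose predicted digit matches the true label.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```

Measuring accuracy on data the model did not see during training gives a fairer estimate than scoring it on the training set.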

Implement the Prediction Function

In this step, you will implement the predict(sample) function to classify a single handwritten character image.

Import the numpy module in the handwritten_digit_classifier.py file:

import numpy as np

Define the predict(sample) function:

def predict(sample):
    """
    Classify a single handwritten digit image.

    Parameters:
    sample -- A list of 64 pixel values (a flattened 8x8 image)

    Returns:
    pred -- The predicted label for the image as an integer
    """
    # Reshape the input into a 2D array holding one sample: shape (1, 64)
    sample = np.array(sample).reshape(1, -1)

    # Use the trained classifier to make a prediction
    pred = clf.predict(sample)

    # predict() returns an array; take its only element as a plain int
    return int(pred[0])

The predict(sample) function works as follows:

  • It converts the input list into a NumPy array and reshapes it into a 2D array with one row, because scikit-learn estimators expect input of shape (n_samples, n_features), the same format as the training data.
  • It passes the reshaped sample to the predict() method of the trained clf classifier, which returns an array containing one label.
  • It returns that label as a Python integer.
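The effect of the reshape step can be seen in isolation. In this small sketch the all-zero sample is a hypothetical blank image, used only to show the shapes, and flat and batch are illustrative names.

```python
import numpy as np

# Hypothetical blank image: 64 zeros, used only to show the shapes.
sample = [0.0] * 64

flat = np.array(sample)      # 1D array of 64 pixel values
batch = flat.reshape(1, -1)  # 2D array: 1 sample, 64 features

print(flat.shape)   # (64,)
print(batch.shape)  # (1, 64)
```

Passing -1 as the second dimension lets NumPy infer the number of features from the length of the input.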

Test the Classifier

You can now test the predict(sample) function with a sample handwritten character image. Add the following example to the handwritten_digit_classifier.py file:

sample = [
    0.0, 0.0, 6.0, 14.0, 4.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 11.0, 16.0, 10.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 8.0, 14.0, 16.0, 2.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 12.0, 12.0, 11.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 11.0, 3.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 5.0, 11.0, 0.0, 0.0,
    0.0, 1.0, 4.0, 4.0, 7.0, 16.0, 2.0, 0.0,
    0.0, 7.0, 16.0, 16.0, 13.0, 11.0, 1.0, 0.0
]

result = predict(sample)
print("Predicted Label:", result)

This should output the predicted label for the given handwritten character image.

Run the handwritten_digit_classifier.py file to execute the example:

python handwritten_digit_classifier.py
## Predicted Label: 9
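To see what the classifier is looking at, the 64 values can be reshaped back into an 8x8 grid and printed as text. This is a rough sketch: the threshold of 8 and the "#" and "." characters are arbitrary choices made here for illustration, and image and rows are illustrative names.

```python
import numpy as np

# The same sample as in the test above.
sample = [
    0.0, 0.0, 6.0, 14.0, 4.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 11.0, 16.0, 10.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 8.0, 14.0, 16.0, 2.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 12.0, 12.0, 11.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 11.0, 3.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 5.0, 11.0, 0.0, 0.0,
    0.0, 1.0, 4.0, 4.0, 7.0, 16.0, 2.0, 0.0,
    0.0, 7.0, 16.0, 16.0, 13.0, 11.0, 1.0, 0.0
]

# Undo the flattening: 64 values back to 8 rows of 8 pixels.
image = np.array(sample).reshape(8, 8)

# Mark darker pixels with "#" and lighter ones with "." (arbitrary threshold).
rows = ["".join("#" if value >= 8 else "." for value in row) for row in image]
print("\n".join(rows))
```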

Summary

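Congratulations! You have completed this project. You loaded the DIGITS dataset, split it into training and testing sets, trained a linear SVM classifier, and wrote a predict(sample) function that classifies a single handwritten digit image.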
