Introduction
In this project, you will learn how to classify the iris dataset using a Support Vector Classifier (SVC) model. The iris dataset is a classic machine learning dataset containing measurements of 150 iris flowers from three species: sepal length, sepal width, petal length, and petal width.
🎯 Tasks
In this project, you will learn:
- How to import the required libraries and load the iris dataset
- How to split the dataset into training and testing sets
- How to create and train a Support Vector Classifier model
- How to make predictions using the trained model
- How to evaluate the model's performance using accuracy score and classification report
🏆 Achievements
After completing this project, you will be able to:
- Use the scikit-learn library to work with the iris dataset
- Split a dataset into training and testing sets
- Create and train a Support Vector Classifier model
- Make predictions using a trained model
- Evaluate a model's performance using accuracy score and classification report
Import Required Libraries and Load Dataset
In this step, you will import the required libraries, load the iris dataset, and split it into training and testing sets.
In iris_classification_svm.py, import the required libraries, including those for loading the dataset, splitting the data, creating the SVM model, and evaluating its performance.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
Load the iris data from sklearn.datasets and split the dataset into training and testing sets. The dataset is split using an 80-20 ratio for training and testing, with a random seed of 42 for reproducibility.
## Continue in the same file
def load_and_split_data() -> tuple:
    """
    Returns:
        tuple: [X_train, X_test, y_train, y_test]
    """
    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    return X_train, X_test, y_train, y_test
This code loads the Iris dataset and splits it into training and testing sets. Here's a breakdown of each part:
- Importing necessary libraries:
  - sklearn.datasets is used to load datasets, including the Iris dataset.
  - sklearn.model_selection provides utilities for splitting datasets into training and testing sets.
  - sklearn.svm contains classes for Support Vector Machines (SVM), a type of machine learning algorithm.
  - sklearn.metrics includes tools for evaluating the performance of models, such as accuracy and classification reports.
- Function Definition: A function named load_and_split_data is defined. This function does the following tasks:
  - Loads the Iris dataset: load_iris() is a function provided by sklearn.datasets that loads the Iris flower dataset, which is a popular dataset for classification tasks. It contains measurements of 150 iris flowers from three different species.
  - Data Separation: The dataset is separated into features (X) and target labels (y). In this case, X holds the 4-dimensional measurements of the iris flowers, and y holds the corresponding species labels (0, 1, or 2).
  - Splitting the Data: train_test_split from sklearn.model_selection is used to split the data into training and testing subsets. The test_size=0.2 parameter means that 20% of the data will be used for testing, while the remaining 80% will be used for training. random_state=42 ensures reproducibility of the split; using the same seed (42 here) will yield the same split every time the code is run.
  - Return Values: The function returns a tuple containing X_train, X_test, y_train, and y_test, which are the feature and target sets for both the training and testing data.
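To make the breakdown above concrete, the following standalone sketch inspects the dataset and the resulting split. It is not part of iris_classification_svm.py; the second call to train_test_split and the names X_train_again and X_test_again are introduced here only to illustrate that a fixed seed reproduces the same split.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# 150 flowers, 4 measurements each, one integer label per flower
print("Features shape:", X.shape)
print("Labels shape:", y.shape)
print("Feature names:", iris.feature_names)
print("Species names:", list(iris.target_names))

# Same 80-20 split and seed as in load_and_split_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Training samples:", X_train.shape[0])  # 120
print("Testing samples:", X_test.shape[0])    # 30

# Illustration only: repeating the call with the same seed
# should give an identical test set
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Same split with same seed:", bool((X_test == X_test_again).all()))
```

Changing random_state to a different integer, or omitting it, would generally produce a different split, so the evaluation numbers later in the project could change as well.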
Create and Train the SVM Model
In this step, you will learn how to create a Support Vector Classifier model and train it on the training data.
## Continue in the same file
def create_and_train_SVM(X_train: list, y_train: list) -> SVC:
    """
    Args:
        X_train: [features for training]
        y_train: [labels for training]
    Returns:
        SVC: [Trained Support Vector Classifier model]
    """
    svm = SVC()
    svm.fit(X_train, y_train)
    return svm
This function, create_and_train_SVM, instantiates a Support Vector Classifier (SVC) model using the sklearn.svm.SVC class and then trains it on the provided training data. Here's a detailed explanation:
- Function Signature: The function takes two arguments:
  - X_train: A list or array-like object containing the features (input variables) for the training dataset.
  - y_train: A list or array-like object containing the corresponding labels (output variables) for the training dataset.
- Instantiating an SVM Model: Inside the function, SVC() is called without any parameters. This creates a default Support Vector Classifier model. The SVC class in scikit-learn offers various parameters to customize the model, such as kernel type and regularization strength, but this basic example uses the default values.
- Training the Model: The fit method of the svm object is called with X_train and y_train. This is where the actual training occurs: the model learns patterns from the features (X_train) associated with their respective class labels (y_train).
- Returning the Trained Model: After training, the function returns the trained SVC model. This model can then be used for making predictions on new, unseen data or for evaluating its performance using a test dataset.
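As a rough illustration of what "default values" means here, the sketch below prints a few of the parameters that SVC() uses when called without arguments, and then builds a second, customized model. The values shown in the comments reflect recent scikit-learn releases and may differ in older versions; the linear_svm configuration is a hypothetical example and is not used elsewhere in this project.

```python
from sklearn.svm import SVC

# The model created in create_and_train_SVM: no arguments, so defaults apply
default_svm = SVC()
print("kernel:", default_svm.kernel)  # 'rbf' in recent scikit-learn releases
print("C:", default_svm.C)            # 1.0
print("gamma:", default_svm.gamma)    # 'scale' in recent scikit-learn releases

# Hypothetical alternative, for illustration only:
# a linear kernel with stronger regularization (smaller C)
linear_svm = SVC(kernel="linear", C=0.5)
print("custom kernel:", linear_svm.get_params()["kernel"])
```

Either object can be passed through the same fit and predict calls, so experimenting with other settings would only require changing the line that constructs the model.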
Make Predictions
In this step, you will learn how to make predictions using the trained SVM model.
## Continue in the same file
def make_predictions(model: SVC, X_test: list) -> list:
    """
    Args:
        model: [Trained Support Vector Classifier model]
        X_test: [features for testing]
    Returns:
        list: [Predictions]
    """
    predictions = model.predict(X_test)
    return predictions
The function make_predictions takes a trained SVM model and a set of test features as inputs, and it returns the predicted labels for the test data. Here's a breakdown:
- Function Arguments:
  - model: This is an instance of the SVC class (Support Vector Classifier) that has already been trained on a dataset. It's assumed that the model knows how to classify new instances based on the patterns it learned during the training phase.
  - X_test: A list or array-like object containing the features (input variables) for the test dataset. These are the unseen examples that the model will predict labels for.
- Making Predictions: Inside the function, the predict method of the model is invoked with X_test as its argument. The predict method applies the learned model to each instance in the test set to estimate its class label. It doesn't require the true labels (y_test), only the input features.
- Returning Predictions: The function then returns these estimated labels. scikit-learn returns them as a NumPy array, which behaves like a list for indexing and iteration. Each element corresponds to the predicted class label of the respective instance in the X_test dataset.
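The predicted labels are integers (0, 1, or 2), which can be hard to read on their own. The standalone sketch below repeats the earlier steps in compact form and then maps each predicted label back to a species name through iris.target_names. The variable predicted_names is introduced here for illustration and does not appear in the project script.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = SVC()
model.fit(X_train, y_train)

# predict() needs only the features and returns one label per test sample
predictions = model.predict(X_test)
print(type(predictions).__name__, predictions.shape)

# Illustration only: index the array of species names with the integer labels
predicted_names = iris.target_names[predictions]
for features, name in zip(X_test[:5], predicted_names[:5]):
    print(features, "->", name)
```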
Evaluate the Model
Evaluate the model by calculating the accuracy score and displaying the classification report.
## Continue in the same file
if __name__ == "__main__":
    ## Load and split the data
    X_train, X_test, y_train, y_test = load_and_split_data()
    ## Create and train the SVM model
    svm_model = create_and_train_SVM(X_train, y_train)
    ## Make predictions
    predictions = make_predictions(svm_model, X_test)
    ## Evaluate the model
    accuracy = accuracy_score(y_test, predictions)
    print(f"Accuracy: {accuracy:.2f}")
    ## Display classification report
    print("Classification Report:")
    print(classification_report(y_test, predictions))
Now, run the script from the terminal:
python iris_classification_svm.py
The output should be:
Accuracy: 1.00
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
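To see where these numbers come from, the following standalone sketch recomputes the accuracy by hand as the fraction of test samples whose predicted label matches the true label, and prints a confusion matrix. The confusion matrix is an addition for illustration and is not part of the original script; its row sums correspond to the support column of the report.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
predictions = SVC().fit(X_train, y_train).predict(X_test)

# Accuracy is the share of predictions that equal the true labels
manual_accuracy = np.mean(predictions == y_test)
print(f"Manual accuracy:  {manual_accuracy:.2f}")
print(f"accuracy_score:   {accuracy_score(y_test, predictions):.2f}")

# Rows are true classes, columns are predicted classes;
# off-diagonal entries would indicate misclassified samples
cm = confusion_matrix(y_test, predictions)
print(cm)
print("Samples per true class:", cm.sum(axis=1).tolist())
```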
By following these steps, you have completed the project of classifying the iris dataset using a Support Vector Classifier (SVC) model.
Summary
Congratulations! You have completed this project. You can practice more labs in LabEx to improve your skills.



