Introduction
In this lab, you will learn how to perform semi-supervised classification on a text dataset using scikit-learn. Semi-supervised learning is a type of machine learning in which a model is trained on both labeled and unlabeled data, typically a small labeled set and a larger unlabeled one. This lab covers two algorithms for semi-supervised text classification: Self-Training and LabelSpreading. We will use the 20 newsgroups dataset to train and test our models.
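To make the idea concrete before the lab begins, here is a minimal sketch of both algorithms in scikit-learn. It is not the lab's actual code: the toy corpus, the choice of LogisticRegression as the base classifier, and the threshold and n_neighbors values are all assumptions made for illustration, and the lab itself uses the 20 newsgroups dataset instead. In scikit-learn, unlabeled samples are marked with the label -1.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.semi_supervised import LabelSpreading, SelfTrainingClassifier

# Hypothetical toy corpus standing in for 20 newsgroups posts.
texts = [
    "the rocket launch reached orbit",
    "nasa plans a new moon mission",
    "the satellite orbit was adjusted",
    "astronauts trained for the space mission",
    "the team won the hockey game",
    "the goalie made a great save",
    "hockey playoffs start next week",
    "the player scored a late goal in the game",
]
true_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = space, 1 = hockey

# Hide half of the labels: scikit-learn treats -1 as "unlabeled".
labels = true_labels.copy()
labels[[2, 3, 6, 7]] = -1

new_texts = ["the mission reached orbit", "a great hockey game"]

# Self-Training: fit a base classifier on the labeled samples, then
# repeatedly add unlabeled samples whose predicted probability exceeds
# the threshold (0.5 is an illustrative value, not a recommendation).
self_training = make_pipeline(
    TfidfVectorizer(),
    SelfTrainingClassifier(LogisticRegression(), threshold=0.5),
)
self_training.fit(texts, labels)
self_training_pred = self_training.predict(new_texts)

# LabelSpreading: propagate labels through a similarity graph built over
# all samples. The dense conversion is a precaution, since some
# scikit-learn versions do not accept sparse input for this estimator.
label_spreading = make_pipeline(
    TfidfVectorizer(),
    FunctionTransformer(lambda X: X.toarray()),
    LabelSpreading(kernel="knn", n_neighbors=3),
)
label_spreading.fit(texts, labels)
label_spreading_pred = label_spreading.predict(new_texts)

print("Self-Training predictions: ", self_training_pred)
print("LabelSpreading predictions:", label_spreading_pred)
```

Both pipelines receive the same partially labeled target array, so either one could be swapped in for the other when comparing results on a larger dataset.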
VM Tips
After the VM finishes starting up, click the Notebook tab in the top-left corner to open Jupyter Notebook for practice.
Jupyter Notebook may take a few seconds to finish loading. Because of limitations in Jupyter Notebook, your operations cannot be validated automatically.
If you run into problems during the lab, feel free to ask Labby. Please leave feedback after the session, and we will resolve the issue promptly.