# Introduction This lab demonstrates how to impute missing data in a dataset using different techniques in scikit-learn. The dataset used here are the diabetes dataset with 10 features and California housing dataset with 8 features. The missing values can be replaced by the mean, the median, or the most frequent value using SimpleImputer. This lab will investigate different imputation techniques like imputation by the constant value, imputation by the mean value of each feature combined with a missing-ness indicator auxiliary variable, k nearest neighbor imputation, and iterative imputation. ## VM Tips After the VM startup is done, click the top left corner to switch to the **Notebook** tab to access Jupyter Notebook for practice. Sometimes, you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook. If you face issues during learning, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.
Click the virtual machine below to start practicing