Classifying Iris Using SVM
In this project, you will learn how to classify the iris dataset using a Support Vector Classifier (SVC) model. The iris dataset is a classic machine learning dataset that contains information about different species of irises, including their sepal length, sepal width, petal length, and petal width.
Python, scikit-learn
Plotting Validation Curves
In machine learning, a validation curve shows a classifier's training and validation scores across a range of values for one hyperparameter, which helps in selecting a good value for that hyperparameter. In this lab, we will use scikit-learn to plot validation curves for a support vector machine (SVM) classifier.
Machine Learning, scikit-learn
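As a rough illustration of the idea, the sketch below computes the scores behind a validation curve with scikit-learn's `validation_curve`. The choice of the iris dataset, the `gamma` hyperparameter, and the value range are assumptions made for this example and are not taken from the lab itself.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

# Assumed setup: iris data and an RBF-kernel SVC, varying only gamma.
X, y = load_iris(return_X_y=True)
gamma_range = np.logspace(-4, 1, 6)

# One row per gamma value, one column per cross-validation fold.
train_scores, valid_scores = validation_curve(
    SVC(), X, y, param_name="gamma", param_range=gamma_range, cv=5
)

mean_train = train_scores.mean(axis=1)
mean_valid = valid_scores.mean(axis=1)
for gamma, tr, va in zip(gamma_range, mean_train, mean_valid):
    print(f"gamma={gamma:g}  train={tr:.3f}  validation={va:.3f}")
```

Plotting `mean_train` and `mean_valid` against `gamma_range` on a logarithmic x-axis gives the curve itself; a widening gap between the two lines is the usual sign of overfitting.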
Decision Tree Regression
In this lab, we will learn how to use the decision tree regression algorithm to fit a sine curve with additional noisy observations. The decision trees will be used to learn local linear regressions approximating the sine curve. We will see that if the maximum depth of the tree is set too high, the decision trees learn overly fine details of the training data and learn from the noise, i.e. they overfit.
Machine Learning, scikit-learn
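A minimal sketch of this effect, assuming a synthetic noisy sine dataset and two illustrative depths (2 and 5) that are not specified in the description above:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Assumed data: 80 points on a sine curve, with noise added to every 5th target.
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# A shallow tree and a deeper tree fitted to the same data.
shallow = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
deep = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, y)

# Training R^2: the deeper tree tracks the training targets, noise included, more closely.
shallow_r2 = shallow.score(X, y)
deep_r2 = deep.score(X, y)
print(f"max_depth=2  training R^2 = {shallow_r2:.3f}")
print(f"max_depth=5  training R^2 = {deep_r2:.3f}")
```

A higher training score for the deeper tree is not evidence of a better model on its own; comparing predictions on a dense grid against the noise-free sine curve shows where the deeper tree chases individual noisy points.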
Underfitting and Overfitting
This lab demonstrates the problems of underfitting and overfitting in machine learning, and how we can use linear regression with polynomial features to approximate nonlinear functions. We will use scikit-learn to generate data, fit models, and evaluate model performance.
Machine Learning, scikit-learn
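The following sketch shows one way to set this up, assuming a cosine target function, 30 noisy samples, and polynomial degrees 1, 4, and 15; these specifics are illustrative choices and are not stated in the description above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Assumed data: noisy samples of a cosine curve on [0, 1].
rng = np.random.RandomState(0)
X = np.sort(rng.rand(30))[:, np.newaxis]
y = np.cos(1.5 * np.pi * X).ravel() + 0.1 * rng.randn(30)

# Cross-validated mean squared error for each polynomial degree.
cv_mse = {}
for degree in (1, 4, 15):
    model = make_pipeline(
        PolynomialFeatures(degree, include_bias=False), LinearRegression()
    )
    scores = cross_val_score(
        model, X, y, scoring="neg_mean_squared_error", cv=10
    )
    cv_mse[degree] = -scores.mean()
    print(f"degree={degree:2d}  CV MSE = {cv_mse[degree]:.3g}")
```

A straight line (degree 1) is too rigid for this target and underfits, while a very high degree has enough freedom to fit the noise; the cross-validated error is what makes the difference visible.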
Decision Tree Analysis
Decision trees are a popular machine learning algorithm used for classification and regression problems. A decision tree is a model that partitions the feature space into a set of non-overlapping regions and predicts the target value for each region. In this lab, we will learn how to analyse the decision tree structure to gain further insight into the relation between the features and the target to predict.
Machine Learning, scikit-learn
Class Probabilities with VotingClassifier
In this lab, we will learn how to plot class probabilities calculated by the VotingClassifier in Scikit-Learn. We will use three different classifiers, LogisticRegression, GaussianNB, and RandomForestClassifier, and average their predicted probabilities using the VotingClassifier. We will then visualize the probability weighting by fitting each classifier on the training set and plotting the predicted class probabilities for the first sample in the dataset.
Machine Learning, scikit-learn
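A small sketch of the averaging step, assuming the iris dataset and equal weights for the three classifiers (neither is specified in the description above):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Assumed data: iris; the first sample is the one we inspect.
X, y = load_iris(return_X_y=True)

estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("gnb", GaussianNB()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Soft voting averages the predicted class probabilities of the fitted estimators.
eclf = VotingClassifier(estimators=estimators, voting="soft").fit(X, y)

# Probabilities from each fitted sub-estimator, then from the ensemble.
individual = np.array([clf.predict_proba(X[:1])[0] for clf in eclf.estimators_])
ensemble = eclf.predict_proba(X[:1])[0]

for (name, _), probs in zip(estimators, individual):
    print(name, np.round(probs, 3))
print("voting", np.round(ensemble, 3))
```

With equal weights, the ensemble row is the plain mean of the three individual rows; passing a `weights` list to `VotingClassifier` turns it into a weighted average.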
Hierarchical Clustering with Connectivity Constraints
This lab demonstrates how to perform hierarchical clustering with connectivity constraints using the Scikit-learn library in Python. In hierarchical clustering, clusters are formed by recursively merging or splitting them based on the distance between them. Connectivity constraints can be used to restrict the formation of clusters based on the connectivity between data points, which can result in more meaningful clusters.
Machine Learning, scikit-learn
Discriminant Analysis Classifiers Explained
Linear and Quadratic Discriminant Analysis (LDA and QDA) are two classic classifiers used in machine learning. LDA uses a linear decision surface, while QDA uses a quadratic decision surface. These classifiers are popular because they have closed-form solutions, work well in practice, and have no hyperparameters to tune.
Machine Learning, scikit-learn
Model Selection: Choosing Estimators and Their Parameters
In machine learning, model selection is the process of choosing the best model for a given dataset. It involves selecting the appropriate estimator and tuning its parameters to achieve optimal performance. This tutorial will guide you through the process of model selection in scikit-learn.
Machine Learning, scikit-learn
Exploring Scikit-Learn Datasets and Estimators
In this lab, we will explore the statistical learning setting and the estimator object in scikit-learn, a popular machine learning library in Python. We will learn about datasets, which are represented as 2D arrays, and how to preprocess them for scikit-learn. We will also explore the concept of estimator objects, which are used to learn from data and make predictions.
Machine Learning, scikit-learn
Revealing Iris Dataset Structure via Factor Analysis
Factor Analysis is a statistical method used to uncover patterns in data. It is often used to identify latent variables that explain correlations among observed variables. In this lab, we will use the Iris dataset to illustrate how Factor Analysis can be used to reveal the underlying structure of the data.
Machine Learning, scikit-learn
Scikit-Learn Libsvm GUI
In this tutorial, you will learn how to use Scikit-learn Libsvm GUI, which is a simple graphical frontend for Libsvm mainly intended for didactic purposes. You can create data points by point and click and visualize the decision region induced by different kernels and parameter settings.
Machine Learning, scikit-learn
Linear Models in Scikit-Learn
In this lab, we will explore linear models in scikit-learn. Linear models are a set of methods used for regression and classification tasks. They assume that the target variable is a linear combination of the features. These models are widely used in machine learning due to their simplicity and interpretability.
Machine Learning, scikit-learn
Iris Flower Classification using Voting Classifier
In this lab, we will use Scikit-Learn's VotingClassifier to predict the class of iris flowers based on two features. We will compare the predictions of DecisionTreeClassifier, KNeighborsClassifier, and SVC classifiers individually, and then use VotingClassifier to combine their predictions and see if we get better results.
Machine Learning, scikit-learn
Supervised Learning with Scikit-Learn
In supervised learning, we want to learn the relationship between two datasets: the observed data X and an external variable y that we want to predict.
Machine Learning, scikit-learn
Exploring Scikit-Learn SGD Classifiers
In this lab, we will explore Stochastic Gradient Descent (SGD), which is a powerful optimization algorithm commonly used in machine learning for solving large-scale and sparse problems. We will learn how to use the SGDClassifier and SGDRegressor classes from the scikit-learn library to train linear classifiers and regressors.
Machine Learning, scikit-learn
Diabetes Prediction Using Voting Regressor
In this lab, we will use a Voting Regressor to predict the progression of diabetes in patients. We will fit three different regressors to the data: Gradient Boosting Regressor, Random Forest Regressor, and Linear Regression. These three regressors will then be combined in the Voting Regressor. Finally, we will plot the predictions made by all models for comparison.
Machine Learning, scikit-learn
Wikipedia PageRank with Randomized SVD
In this lab, we will be analyzing the graph of links inside Wikipedia articles to rank articles by relative importance according to the eigenvector centrality. The traditional way to compute the principal eigenvector is to use the power iteration method. Here we will be using Martinsson's Randomized SVD algorithm implemented in scikit-learn.
Machine Learning, scikit-learn