Pandas Data Manipulation Fundamentals

This tutorial is from the open-source community.

Introduction

This lab introduces the fundamental operations of pandas, a widely used Python library for data manipulation. You will import the library, create and inspect a DataFrame, fill in missing values, and plot a column, working through a short code snippet at each step.

VM Tips

After the VM startup is done, click the top left corner to switch to the Notebook tab to access Jupyter Notebook for practice.

Sometimes, you may need to wait a few seconds for Jupyter Notebook to finish loading. The validation of operations cannot be automated because of limitations in Jupyter Notebook.

If you face issues during learning, feel free to ask Labby. Provide feedback after the session, and we will promptly resolve the problem for you.

Importing Pandas

First, import the pandas library, along with NumPy, which a later step uses to represent missing values:

# Import pandas and NumPy under their conventional aliases
import pandas as pd
import numpy as np
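
A quick way to confirm that both imports worked is to print the installed versions. This is only a small sanity check; the version numbers you see depend on your environment.

```python
import numpy as np
import pandas as pd

# Each library exposes its version as a string attribute.
print("pandas version:", pd.__version__)
print("NumPy version:", np.__version__)
```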

Creating a DataFrame

Next, we will create a DataFrame, which is a two-dimensional labeled data structure with columns of potentially different types. It is generally the most commonly used pandas object.

# Create a DataFrame from a dictionary: each key becomes a column name
df = pd.DataFrame({'A': [1, 2, 3]})
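
To illustrate "columns of potentially different types", here is a minimal sketch that builds a DataFrame with text, integer, and floating-point columns. The variable name people and the sample values are made up for illustration and are not part of the lab.

```python
import pandas as pd

# Hypothetical sample data: each dictionary key becomes a column label,
# and each list holds that column's values (all lists have the same length).
people = pd.DataFrame({
    "name": ["Ada", "Grace", "Linus"],  # text
    "age": [36, 45, 28],                # integers
    "score": [91.5, 88.0, 79.25],       # floating-point numbers
})

print(people)
print(people.dtypes)  # one data type per column
print(people.shape)   # (number of rows, number of columns)
```

Because every column is labeled, a single column can be selected by name, for example people["age"].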

Understanding DataFrames

Now, let's try to understand more about the DataFrame we just created.

# Display the DataFrame
print(df)

# Print a summary: index range, column names, non-null counts, and dtypes
df.info()
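
Beyond print and info, a few other attributes and methods are commonly used to inspect a DataFrame. The sketch below recreates the same single-column DataFrame so that it runs on its own.

```python
import pandas as pd

# Same single-column DataFrame as in the lab.
df = pd.DataFrame({'A': [1, 2, 3]})

print(df.shape)       # (number of rows, number of columns)
print(df.columns)     # column labels
print(df.dtypes)      # data type of each column
print(df.head(2))     # first two rows
print(df.describe())  # summary statistics for numeric columns
```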

Working with Missing Data

Real-world data often has gaps, which pandas represents as NaN. pandas provides methods to detect missing values (isna), drop them (dropna), and fill them (fillna).

# Create a DataFrame with missing values (np.nan marks a missing entry)
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# Replace every missing value with 0
df = df.fillna(0)
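
Filling with 0 is only one option. The sketch below starts from the same data and shows how you might count the missing values, drop incomplete rows, or fill each gap with its column mean. The variable names are illustrative, and which strategy is appropriate depends on your data.

```python
import numpy as np
import pandas as pd

# Same data as in the lab, before any filling.
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# Count the missing values in each column (A: 1, B: 2, C: 0).
missing_per_column = df.isna().sum()
print(missing_per_column)

# Keep only the rows that contain no missing values (here, just the first row).
complete_rows = df.dropna()
print(complete_rows)

# Fill each gap with the mean of its own column (A: 1.5, B: 5.0).
filled_with_mean = df.fillna(df.mean())
print(filled_with_mean)
```

Each of these calls returns a new object and leaves df unchanged, so the strategies can be compared side by side.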

Data Visualization

pandas integrates with the Matplotlib library: calling plot() on a Series or DataFrame draws a Matplotlib chart, which is a line plot by default.

# Import Matplotlib's pyplot interface
import matplotlib.pyplot as plt

# Draw column A as a line plot, then display the figure
df['A'].plot()
plt.show()
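
The plot call returns a Matplotlib Axes object, which can be used to add a title and axis labels. The following is a minimal sketch; the label text is made up, and the values in column A are the ones that result from filling the missing entry with 0.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column A after its missing value has been filled with 0.
df = pd.DataFrame({'A': [1.0, 2.0, 0.0]})

# Series.plot() draws a line plot and returns the Matplotlib Axes it drew on.
ax = df['A'].plot(marker='o', title='Column A')

# Label the axes (the label text here is illustrative).
ax.set_xlabel('Row index')
ax.set_ylabel('Value')
```

In a Jupyter notebook the figure renders below the cell. In a standalone script, call plt.show() afterwards to open the plot window.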

Summary

In this lab, we have covered some of the basics of the pandas library in Python, including importing the library, creating and manipulating a DataFrame, dealing with missing data, and visualizing the data. These skills are fundamental to any data analysis task in Python, and becoming proficient in pandas will allow you to handle and analyze data effectively.
