Working with Pandas

Introduction

Pandas is a powerful data manipulation library for Python. It is widely used for data analysis and cleaning because it is flexible and easy to use. In this lab, we will learn how to use Pandas for basic operations: creating a DataFrame, selecting a column, and computing simple statistics.

Import the Pandas Package

Before you can use Pandas, you need to import it. It's a common practice to import Pandas with the alias pd.

# Import the pandas package under its conventional alias
import pandas as pd

Create a DataFrame

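Tabular data in pandas is stored in a DataFrame, a 2-dimensional labeled data structure whose columns can hold different types. One way to create a DataFrame is from a dictionary of lists: each key becomes a column label and each list supplies that column's values.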

# Create a DataFrame from a dictionary of lists
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)
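
To confirm that the DataFrame was built as intended, it can help to inspect its shape, column labels, and column types. The following is a minimal sketch that rebuilds the same df so it runs on its own. The exact dtype reported for the text columns depends on the installed pandas version, so no output is shown for it.

```python
import pandas as pd

# Rebuild the DataFrame from the step above so this snippet is self-contained
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

# (rows, columns)
print(df.shape)  # (3, 3)

# Column labels, in the order of the dictionary keys
print(list(df.columns))  # ['Name', 'Age', 'Sex']

# One dtype per column; "Age" is an integer column
print(df.dtypes)
```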

Select a Column

If you want to work with the data in a specific column, select it by putting the column label in square brackets. The result is a pandas Series.

# Select the 'Age' column
df["Age"]
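
As a small illustration of the difference between a Series and a DataFrame, the sketch below selects one column with a single label and two columns with a list of labels. The variable names ages and subset are chosen here for illustration and are not part of the lab.

```python
import pandas as pd

# Same data as in the lab, repeated so the snippet runs on its own
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

# A single label returns a one-dimensional Series
ages = df["Age"]
print(isinstance(ages, pd.Series))  # True
print(ages.shape)  # (3,)

# A list of labels returns a two-dimensional DataFrame
subset = df[["Name", "Age"]]
print(isinstance(subset, pd.DataFrame))  # True
print(subset.shape)  # (3, 2)
```

Note the double brackets in the second selection: the outer pair performs the selection and the inner pair is an ordinary Python list of column labels.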

Perform Basic Statistics

Pandas provides many methods for computing statistics. For instance, you can find the maximum value in a column using max().

# Find the maximum age
df["Age"].max()
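
Other summary methods on a Series follow the same pattern as max(). The sketch below uses only the three ages from this lab, stored directly in a Series so that it is self-contained, and also calls min() and mean(), which the lab itself does not cover.

```python
import pandas as pd

# The "Age" values from the lab's DataFrame
ages = pd.Series([22, 35, 58], name="Age")

# Largest and smallest values
print(ages.max())  # 58
print(ages.min())  # 22

# Arithmetic mean: (22 + 35 + 58) / 3
print(round(ages.mean(), 2))  # 38.33
```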

You can also get a quick overview of the numerical columns in a DataFrame using describe(), which reports the count, mean, standard deviation, minimum, quartiles, and maximum of each one.

# Describe the numerical data
df.describe()
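
The result of describe() is itself a DataFrame, so individual statistics can be read out of it by label. The following is a minimal sketch using the lab's data. The variable name summary is an assumption made for illustration.

```python
import pandas as pd

# Same data as in the lab, repeated so the snippet runs on its own
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

# By default only numerical columns are summarized, so only "Age" appears
summary = df.describe()
print(list(summary.columns))  # ['Age']

# Each row of the summary is one statistic
print(list(summary.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']

# Look up a single statistic by row and column label
print(summary.loc["max", "Age"])  # 58.0
print(summary.loc["50%", "Age"])  # 35.0 (the median)
```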

Summary

In this lab, we learned how to import the Pandas package, create a DataFrame, select a column, and perform basic statistics. Pandas is a versatile tool that can handle data of different types, making it a great choice for data analysis and manipulation.