How to chain multiple Pandas operations?

To chain multiple Pandas operations, you call methods one after another with the dot (.) operator, so each method runs on the result of the previous one. To keep a long chain readable, a common convention is to wrap the entire expression in parentheses (), which lets you put each operation on its own line.

Here is the pattern and a practical example:

1. The Basic Syntax

result = (df
    .operation1()
    .operation2()
    .operation3()
)
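
To make the pattern concrete, here is a minimal runnable sketch. The tiny DataFrame and its column names are made up for illustration; the three methods are just stand-ins for operation1, operation2, and operation3.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({"Name": ["Ana", "Ben", None], "Score": [88, 92, 75]})

result = (df
    .dropna()                    # operation1: drop rows with missing values
    .rename(columns=str.lower)   # operation2: lowercase the column names
    .reset_index(drop=True)      # operation3: renumber the rows from 0
)

print(result)
```

Each method returns a new DataFrame, which is why the next method can be called directly on it.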

2. Practical Example

Let's take the students.csv data you've been working with. Suppose you want to:

Filter for students aged 21 and older.
Select only the name and score columns.
Sort them by score, highest first.

import pandas as pd

df = pd.read_csv('students.csv')

# Chaining multiple operations
final_report = (df
    .loc[df['age'] >= 21]             # Step 1: Filter rows
    .loc[:, ['name', 'score']]       # Step 2: Select columns
    .sort_values(by='score', ascending=False) # Step 3: Sort
)

print(final_report)
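
If you don't have the file at hand, the same chain can be tried on an in-memory DataFrame. This is only a sketch: the rows below are invented stand-ins for students.csv, and the filter uses a lambda so that it does not depend on the outer variable name.

```python
import pandas as pd

# Hypothetical stand-in for students.csv (values are made up)
students = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "age": [20, 22, 21, 25],
    "score": [91, 78, 85, 88],
})

final_report = (students
    .loc[lambda d: d["age"] >= 21]              # Step 1: keep students aged 21+
    .loc[:, ["name", "score"]]                  # Step 2: keep two columns
    .sort_values(by="score", ascending=False)   # Step 3: highest score first
)

print(final_report)
```

With this sample data, Ana is filtered out because she is 20, and the remaining three students come back ordered Dev, Cara, Ben.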

3. Using `pipe()` for Custom Functions

If you have a custom operation that isn't a built-in Pandas method, you can use .pipe() to keep the chain going.

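def add_bonus_points(data):
    # Return a new DataFrame instead of modifying the input in place,
    # which avoids SettingWithCopyWarning on a filtered slice
    return data.assign(score=data['score'] + 5)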

result = (df
    .loc[df['major'] == 'Engineering']
    .pipe(add_bonus_points) # Inject your own function into the chain
    .sort_values('score')
)
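
.pipe() also forwards any extra arguments to your function, which is handy when the bonus should not be hard-coded. A small sketch, assuming a hypothetical add_bonus helper and made-up sample rows:

```python
import pandas as pd

def add_bonus(data, points):
    # Hypothetical helper: returns a new DataFrame, leaving the input untouched
    return data.assign(score=data["score"] + points)

# Made-up sample data for illustration
students = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "major": ["Engineering", "History", "Engineering"],
    "score": [91, 78, 85],
})

result = (students
    .loc[lambda d: d["major"] == "Engineering"]
    .pipe(add_bonus, points=5)   # extra keyword arguments are passed to add_bonus
    .sort_values("score")
)

print(result)
```

Because add_bonus uses .assign() rather than writing into its argument, the original students DataFrame keeps its scores.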

4. Key Tips for Chaining

Order Matters: Just like a factory assembly line, the output of the first method is the input for the next. Filtering early is usually faster.
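Use .loc with Callables: As we discussed, if a previous step in the chain renames or creates a column, the variable df still refers to the original DataFrame and won't have that column. Use a lambda inside .loc so the filter sees the intermediate result instead: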
```
.loc[lambda x: x['new_column'] > 50]
```
Debugging: If a long chain isn't working, comment out the bottom lines one by one to see where the data stops looking the way you expect.
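
The last two tips can be sketched together. This is one possible approach, not the only one: peek is a hypothetical helper (not part of Pandas) that prints the shape at a given point and passes the data through unchanged, and the sample rows are made up.

```python
import pandas as pd

def peek(data, label):
    # Hypothetical debugging helper: report the shape, then return the data as-is
    print(f"{label}: {data.shape}")
    return data

# Made-up sample data for illustration
students = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "score": [40, 55, 70],
})

result = (students
    .assign(curved=lambda d: d["score"] * 1.1)   # create a new column mid-chain
    .pipe(peek, "after assign")
    .loc[lambda d: d["curved"] > 50]             # lambda sees the new column
    .pipe(peek, "after filter")
)

print(result)
```

Once the chain behaves as expected, the .pipe(peek, ...) lines can simply be deleted without touching the other steps.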

Why do this? It keeps your code clean, avoids creating dozens of temporary variables like df2, df3, df_final, and makes your data processing logic look like a clear sequence of instructions.

Would you like to try rewriting your current script using a small chain?