Yes, absolutely! You can combine boolean indexing (conditions) with the .loc operator. This is one of the most powerful features in Pandas.
When you use a condition with .loc, Pandas filters the rows first based on whether the condition is True, and then selects the columns you specified.
Example: Slicing by Condition
Suppose you want to see the columns from name through major for students who are older than 21. You can do it like this:
import pandas as pd
df = pd.read_csv('students.csv')
# Condition: df['age'] > 21
# Columns: 'name' through 'major'
filtered_data = df.loc[df['age'] > 21, 'name':'major']
print(filtered_data)
How it works:
- Row Selector (df['age'] > 21): Pandas creates a mask of True/False values, one for every row. Only rows where the mask is True are selected.
- Column Selector ('name':'major'): It then picks the columns labeled from 'name' to 'major'. Label slices in .loc include both endpoints, so any columns sitting between them come along too.
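If you'd like to see the two steps separately, here is a minimal sketch. It assumes a small made-up DataFrame in place of students.csv (the names, ages, and column order are hypothetical), so you can run it without the file:

```python
import pandas as pd

# Hypothetical sample data standing in for students.csv
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "age": [20, 22, 23, 21],
    "major": ["Math", "Engineering", "Physics", "Engineering"],
    "score": [88, 92, 79, 85],
})

# Step 1: the boolean mask, one True/False value per row
mask = df["age"] > 21
print(mask.tolist())  # [False, True, True, False]

# Step 2: keep rows where the mask is True, and the label slice 'name' through 'major'
filtered_data = df.loc[mask, "name":"major"]
print(filtered_data.columns.tolist())  # ['name', 'age', 'major']
print(filtered_data["name"].tolist())  # ['Ben', 'Cara']

# To get exactly two columns, pass a list of labels instead of a slice
only_two = df.loc[mask, ["name", "major"]]
print(only_two.columns.tolist())  # ['name', 'major']
```

Notice that the slice also returns age here, because age sits between name and major in this sample. If your file has a different column order, the slice will return a different set of columns.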
Why use .loc for conditions?
While you could just write df[df['age'] > 21], using .loc is better because:
- Readability: It clearly separates row filtering and column selection.
- Efficiency: It selects rows and columns in a single indexing operation, so you don't build the full filtered DataFrame first just to pick a few columns from it.
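- Assignment: If you want to change data (e.g., "Set the score to 100 for all Engineering majors"), .loc is the safest way to do it: df.loc[df['major'] == 'Engineering', 'score'] = 100. Chained indexing such as df[df['major'] == 'Engineering']['score'] = 100 may write to a temporary copy and leave df unchanged.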
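Here is a small sketch of the assignment case. Again, the DataFrame is made-up sample data rather than your actual students.csv:

```python
import pandas as pd

# Hypothetical sample data standing in for students.csv
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara"],
    "major": ["Math", "Engineering", "Engineering"],
    "score": [88, 92, 79],
})

# One .loc call with the row mask and the column label together,
# so the write lands on df itself
df.loc[df["major"] == "Engineering", "score"] = 100
print(df["score"].tolist())  # [88, 100, 100]

# Chained form to avoid (left commented out on purpose):
# df[df["major"] == "Engineering"]["score"] = 100
# The first [] may return a copy, so the assignment can be lost,
# and Pandas typically emits a warning about it.
```

Only the rows matching the condition are changed, and the other rows keep their original scores.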
Give it a try in your main.py! You can try filtering for students with a specific major or a score above a certain value.