Pandas Advanced Selection & Filtering
In the previous tutorial, we learned how to select rows and columns using manual indices (like row 0 or row 1). However, in real-world data analysis, we rarely know the row numbers.
Instead, we usually want to select data based on logic (e.g., "Show me all students who passed"). In this tutorial, we will learn how to chain selections and use Boolean Indexing to filter data.
๐น 1. Chaining Selections (Rows + Columns)
Sometimes you need to grab specific rows and immediately select specific columns from them. There are two ways to do this: the "One-Liner" way and the "Step-by-Step" way.
Method A: The "One-Liner" (Chaining)
You can chain the commands together. First, we select the rows using .loc, and then immediately select the column using [[ ]].
๐งช Code Example
import pandas as pd
data = {
"Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
"Class": [11, 39, 47, 56, 76],
"Status": ["pass", "fail", "pass", "pass", "fail"]
}
table = pd.DataFrame(data)
# Select rows 1 and 4, then keep only the "Status" column
print(table.loc[[1, 4]][["Status"]])
๐ค Output
Status
1 fail
4 fail
Method B: The "Smart Way" (Step-by-Step)
While one-liners are fast, breaking the code into steps makes it easier to debug and understand. You create an intermediate DataFrame first.
๐งช Code Example
# Step 1: Select the specific rows first
selected_row = table.loc[[1, 4]]
print("--- Selected Rows ---")
print(selected_row)
print("*" * 30)
# Step 2: Select the column from that result
selected_column = selected_row[["Status"]]
print("--- Final Result ---")
print(selected_column)
๐ค Output
--- Selected Rows ---
Name Class Status
1 Aman 39 fail
4 Devansh 76 fail
******************************
--- Final Result ---
Status
1 fail
4 fail
๐น 2. Filtering by Condition (Boolean Indexing)
This is the most essential skill in Pandas. Instead of asking for "Row 0", we ask for "Rows where Status is pass". This is called Filtering.
A. Basic Filtering
We create a condition inside the brackets: table["Status"] == "pass".
Pandas checks every row, and if the condition is True, it keeps the row.
๐งช Code Example
# Select only rows where Status is "pass"
pass_table = table[table["Status"] == "pass"]
print(pass_table)
๐ค Output
Name Class Status
0 Atul 11 pass
2 Harsh 47 pass
3 Rishi 56 pass
B. Filter + Column Selection
Often, you want to filter the data first, and then only see specific columns (e.g., "Get the Names of everyone who passed").
๐งช Code Example
# Step 1: Filter the table
pass_table = table[table["Status"] == "pass"]
# Step 2: Select the Name column from the filtered table
pass_name = pass_table[["Name"]]
print(pass_name)
๐ค Output
Name
0 Atul
2 Harsh
3 Rishi
๐น 3. Filtering with Multiple Conditions
What if you have two requirements? For example, you want students who Passed AND are in Class 11.
In Pandas, we use the & symbol for "AND".
⚠ Important Syntax Rule: You must wrap each condition in parentheses ( ).
๐งช Code Example
# Filter: Status is "pass" AND Class is 11
pass_11_table = table[(table["Status"] == "pass") & (table["Class"] == 11)]
# Select just the Name column
pass_11_name = pass_11_table[["Name"]]
print(pass_11_name)
๐ค Output
Name
0 Atul
๐ Explanation
(table["Status"] == "pass"): Finds all students who passed.&: The logic operator for "AND".(table["Class"] == 11): Finds all students in class 11.- Pandas returns only the rows where both statements are True.
๐ Summary
| Technique | Syntax Code | Use Case |
|---|---|---|
| Chaining | df.loc[[1,2]][["Col"]] |
Quick extraction of specific cells. |
| Condition | df[df["Col"] == "Val"] |
Filtering rows by value. |
| Multi-Condition | df[(Cond1) & (Cond2)] |
Strict filtering with AND logic. |