Selecting Data
In the previous tutorial, we learned how to create DataFrames. Now, we will master the art of slicing and dicing data: selecting specific columns, specific rows, and combinations of both.
🔹 1. Column Selection
Selecting columns is one of the most common operations in Pandas. We use square brackets [] directly on the DataFrame.
A. Selecting a Single Column
To select a single column and keep it as a DataFrame, we use double square brackets.
🧪 Code Example
import pandas as pd
# Setup our sample DataFrame
data = {
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"],
}
table = pd.DataFrame(data)
# Select only the "Name" column
print(table[["Name"]])
📤 Output
      Name
0     Atul
1     Aman
2    Harsh
3    Rishi
4  Devansh
⚠ Important Note: Series vs. DataFrame
- If you use single brackets, table["Name"], Pandas returns a Series (a one-dimensional labeled column of values).
- If you use double brackets, table[["Name"]], Pandas returns a DataFrame (a table).
- Use double brackets when you want the output to stay a table.
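A quick way to see the difference is to check the type and shape of each result. The following is a minimal sketch that rebuilds the sample table; the variable names as_series and as_frame are illustrative only.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
table = pd.DataFrame({
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"],
})

as_series = table["Name"]    # single brackets -> Series
as_frame = table[["Name"]]   # double brackets -> DataFrame

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_series.shape)           # (5,)   one dimension
print(as_frame.shape)            # (5, 1) rows and columns
```

Both results hold the same five names; only the container differs.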
B. Selecting Multiple Columns
To select multiple columns, we pass a list of column names inside the brackets. The result is a new DataFrame containing only those columns, in the order listed.
🧪 Code Example
# Select "Name" and "Class" columns
print(table[["Name", "Class"]])
📤 Output
      Name  Class
0     Atul     11
1     Aman     39
2    Harsh     47
3    Rishi     56
4  Devansh     76
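Two details of list-based column selection are worth trying out: the output follows the order of the list you pass, and asking for a column that does not exist raises a KeyError. This sketch assumes the same sample table; "Grade" is a hypothetical column name chosen because it is not in the table.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
table = pd.DataFrame({
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"],
})

# Columns come back in the order given in the list
reordered = table[["Class", "Name"]]
print(list(reordered.columns))  # ['Class', 'Name']

# "Grade" is a made-up column that is not in the table
missing_error = None
try:
    table[["Name", "Grade"]]
except KeyError as err:
    missing_error = err
    print("KeyError:", err)
```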
🔹 2. Row Selection (The .loc Indexer)
To select rows, we use the .loc (Location) property. This looks up rows by their Index Label.
A. Selecting a Single Row
Use .loc with double brackets around the index label to select a single row and keep it as a DataFrame.
🧪 Code Example
# Select the row at index 1
print(table.loc[[1]])
📤 Output
   Name  Class Status
1  Aman     39   fail
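The Series vs. DataFrame distinction from column selection applies to rows as well. As a small sketch on the same sample table (row_series and row_frame are illustrative names), single brackets return the row as a Series whose index is the column names, while double brackets keep it as a one-row DataFrame.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
table = pd.DataFrame({
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"],
})

row_series = table.loc[1]    # single brackets -> Series
row_frame = table.loc[[1]]   # double brackets -> one-row DataFrame

print(type(row_series).__name__)  # Series
print(list(row_series.index))     # ['Name', 'Class', 'Status']
print(row_series["Name"])         # Aman
print(row_frame.shape)            # (1, 3)
```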
B. Selecting Multiple Rows
Pass a list of indices to select multiple specific rows.
🧪 Code Example
# Select rows 0, 1, and 2
print(table.loc[[0, 1, 2]])
📤 Output
    Name  Class Status
0   Atul     11   pass
1   Aman     39   fail
2  Harsh     47   pass
ℹ Note on Indexing:
.loc looks for the Label of the index, not the position.
While our rows are numbered 0, 1, 2 by default, if your rows were named "A", "B", "C", you would use .loc[["A"]].
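To make the label-based lookup concrete, here is a sketch with a small table whose rows are named "A", "B", "C". The table labeled and its contents are made up for illustration. With these labels, .loc[["A"]] works, while .loc[[0]] fails because no row carries the label 0.

```python
import pandas as pd

# A hypothetical table whose index labels are strings instead of 0, 1, 2
labeled = pd.DataFrame(
    {"Name": ["Atul", "Aman", "Harsh"], "Class": [11, 39, 47]},
    index=["A", "B", "C"],
)

# Look up the row by its label
row_a = labeled.loc[["A"]]
print(row_a)

# There is no row labeled 0, so .loc raises a KeyError
label_error = None
try:
    labeled.loc[[0]]
except KeyError as err:
    label_error = err
    print("KeyError:", err)
```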
🔹 3. Combined Selection (Rows & Columns)
This is the most flexible way to select data. You can pinpoint specific data by defining both the rows and the columns you want.
The syntax follows this pattern:
table.loc[ [Row_Labels] , [Column_Labels] ]
A. Specific Row, Specific Column
Here, we ask for Row 0 and only the Name column.
🧪 Code Example
print(table.loc[[0], ["Name"]])
📤 Output
   Name
0  Atul
Explanation
- Pandas looks at the row with index label 0.
- It then intersects that with the Column "Name".
- It returns that single cell as a 1×1 DataFrame rather than as a bare value.
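If you want the bare value instead of a 1×1 table, one option is to pass the row label and column name to .loc without the inner lists. A minimal sketch on the same sample table, with cell_frame and cell_value as illustrative names:

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
table = pd.DataFrame({
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"],
})

cell_frame = table.loc[[0], ["Name"]]  # lists -> 1x1 DataFrame
cell_value = table.loc[0, "Name"]      # plain labels -> the value itself

print(cell_frame.shape)  # (1, 1)
print(cell_value)        # Atul
```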
B. Multiple Rows, Multiple Columns
Here, we select a block of data: Rows 0, 1, 2 and Columns Name, Class.
🧪 Code Example
print(table.loc[[0, 1, 2], ["Name", "Class"]])
📤 Output
    Name  Class
0   Atul     11
1   Aman     39
2  Harsh     47
Explanation
- First Argument [0, 1, 2]: Filters the DataFrame to keep only these three rows.
- Second Argument ["Name", "Class"]: Filters columns to keep only Name and Class.
- The result is a smaller subset table containing only the requested intersection.
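The two-step filtering described above can be checked by comparing shapes. This sketch, again on the sample table, also shows that the order of the lists controls the order of the output; block and swapped are illustrative names.

```python
import pandas as pd

# Rebuild the sample table so this snippet runs on its own
table = pd.DataFrame({
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"],
})

# Three rows and two columns out of the original five rows and three columns
block = table.loc[[0, 1, 2], ["Name", "Class"]]
print(block.shape)  # (3, 2)
print(table.shape)  # (5, 3), the original table is unchanged

# Rows and columns come back in the order the lists give them
swapped = table.loc[[2, 0], ["Class", "Name"]]
print(list(swapped.index))    # [2, 0]
print(list(swapped.columns))  # ['Class', 'Name']
```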
Summary
| Goal | Syntax |
|---|---|
| Just Columns | df[["Col1", "Col2"]] |
| Just Rows | df.loc[[0, 1]] |
| Both (Intersection) | df.loc[[Rows], [Cols]] |