Pandas: Adding & Removing Columns
In data analysis, your dataset is rarely static. You often need to add new data (like assigning grades) or clean up unwanted data. In this tutorial, we will learn the different ways to create columns and the proper way to delete them.
🔹 1. Adding New Columns
There are two main ways to add a new column to a DataFrame: assigning a list of values (one per row) or assigning a single value that is applied to every row.
A. Assigning a List (Different Values for Each Row)
If each row needs its own value, assign a list with one entry per row.
🧪 Code Example
import pandas as pd
# Setup sample data
data = {
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
    "Status": ["pass", "fail", "pass", "pass", "fail"]
}
table = pd.DataFrame(data)
# Add a "Grade" column with specific values
table["Grade"] = ["A", "AB", "C", "D", "E"]
print(table)
📤 Output
Name Class Status Grade
0 Atul 11 pass A
1 Aman 39 fail AB
2 Harsh 47 pass C
...
⚠ Important Warning: The list length must match the DataFrame length exactly. If you have 5 rows, you must provide 5 values. If the lengths differ, Pandas raises a ValueError and the column is not added.
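To see what the warning above means in practice, here is a minimal sketch. It rebuilds a smaller stand-in for the tutorial's table (an assumption made so the snippet runs on its own) and tries to assign 3 values to 5 rows.

```python
import pandas as pd

# Stand-in table with 5 rows, mirroring the tutorial's data
table = pd.DataFrame({
    "Name": ["Atul", "Aman", "Harsh", "Rishi", "Devansh"],
    "Class": [11, 39, 47, 56, 76],
})

# Only 3 values for 5 rows: Pandas rejects the assignment
try:
    table["Grade"] = ["A", "B", "C"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)              # the text of the ValueError
print("Grade" in table.columns)   # the column was not created
```

Catching the error here is only for demonstration. In real code you would normally fix the list so its length matches the table.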
B. Assigning a Single Value (Broadcasting)
If you want to assign the same value to every single row, you just assign that one value. Pandas is smart enough to repeat it for you. This is called "Broadcasting".
🧪 Code Example
# Assign "Indian" to the "Country" column for EVERYONE
table["Country"] = "Indian"
print(table)
📤 Output
Name Class Status Grade Country
0 Atul 11 pass A Indian
1 Aman 39 fail AB Indian
2 Harsh 47 pass C Indian
...
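Broadcasting is not limited to text. The short sketch below, which uses a small stand-in table and a hypothetical "Bonus" column that is not part of the tutorial's data, suggests that any single value, including a number, is repeated for every row.

```python
import pandas as pd

# Small stand-in table so the snippet runs on its own
table = pd.DataFrame({"Name": ["Atul", "Aman", "Harsh"]})

# One string value is repeated for every row
table["Country"] = "Indian"

# A number is broadcast the same way ("Bonus" is a made-up column)
table["Bonus"] = 5

print(table)
```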
🔹 2. Removing Columns (The .drop Method)
To delete columns, we use the .drop() method.
A. Temporary Removal (Default Behavior)
By default, .drop() returns a new DataFrame with the column removed. The original table stays untouched.
Why axis=1?
axis=0 (Default) = Searches for the label in the Rows (Index).
axis=1 = Searches for the label in the Columns (Headers).
If you leave out axis=1, Pandas will look for "Grade" in the row labels, fail to find it, and raise a KeyError.
🧪 Code Example
# Drop "Grade" temporarily
removed_temp = table.drop("Grade", axis=1)
print("----------Temporary Removal-----------------")
print(removed_temp)
print("----------Main Data (Unchanged)-------------")
print(table)
📤 Output
----------Temporary Removal-----------------
Name Class Status Country
0 Atul 11 pass Indian
...
----------Main Data (Unchanged)-------------
Name Class Status Grade Country
0 Atul 11 pass A Indian
...
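The axis rule described earlier can be checked directly. The sketch below uses a two-row stand-in table (an assumption for brevity). It first calls .drop() without axis=1 to show the KeyError, then compares axis=1 with the columns= keyword, which .drop() also accepts and which names the target axis explicitly.

```python
import pandas as pd

# Two-row stand-in table
table = pd.DataFrame({
    "Name": ["Atul", "Aman"],
    "Grade": ["A", "AB"],
})

# Without axis=1, drop() searches the row labels (0 and 1) for "Grade"
try:
    table.drop("Grade")
    raised_key_error = False
except KeyError:
    raised_key_error = True
print(raised_key_error)

# These two calls remove the same column
by_axis = table.drop("Grade", axis=1)
by_keyword = table.drop(columns="Grade")
print(by_axis.equals(by_keyword))
```

Some readers find columns= easier to read than axis=1, since it says what is being dropped. Both forms leave the original table unchanged.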
B. Permanent Removal (Inplace)
If you want to remove the column from the original table itself, use inplace=True. This modifies the DataFrame the variable refers to instead of returning a new one.
🧪 Code Example
# Remove "Grade" forever from the original table
table.drop("Grade", axis=1, inplace=True)
print("----------Main Data (Modified)--------------")
print(table)
📤 Output
----------Main Data (Modified)--------------
Name Class Status Country
0 Atul 11 pass Indian
...
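One detail worth knowing about inplace=True is that the call returns None, so its result should not be assigned to a variable. The sketch below uses a small stand-in table (an assumption) to show this, along with reassignment as an alternative that ends with the same result.

```python
import pandas as pd

# Small stand-in table
table = pd.DataFrame({
    "Name": ["Atul", "Aman"],
    "Class": [11, 39],
    "Grade": ["A", "AB"],
})

# inplace=True changes `table` itself and returns None
result = table.drop("Grade", axis=1, inplace=True)
print(result)
print(list(table.columns))

# Alternative: keep the default behavior and reassign the returned DataFrame
table = table.drop("Class", axis=1)
print(list(table.columns))
```

Writing table = table.drop(..., inplace=True) is a common mistake, because it replaces the table with None.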
🔹 3. Removing Multiple Columns
To remove more than one column, pass a list of column names.
🧪 Code Example
# Remove "Class" and "Status" temporarily
removed_temp = table.drop(["Class", "Status"], axis=1)
print(removed_temp)
📤 Output
Name Country
0 Atul Indian
1 Aman Indian
...
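When a list contains a name that is not in the table, .drop() raises a KeyError by default. As a hedged sketch on a stand-in table (an assumption), the errors="ignore" option of .drop() skips missing names instead, which can help when you are unsure whether a column was already removed.

```python
import pandas as pd

# Stand-in table after "Grade" has already been removed
table = pd.DataFrame({
    "Name": ["Atul", "Aman"],
    "Class": [11, 39],
    "Status": ["pass", "fail"],
    "Country": ["Indian", "Indian"],
})

# Drop two existing columns in one call
trimmed = table.drop(["Class", "Status"], axis=1)
print(list(trimmed.columns))

# "Grade" no longer exists; errors="ignore" skips it instead of raising
safe = table.drop(["Class", "Grade"], axis=1, errors="ignore")
print(list(safe.columns))
```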
Summary
| Goal | Syntax | Note |
|---|---|---|
| Add List | df["Col"] = [1, 2, 3] | Length must match rows. |
| Add Single Value | df["Col"] = "A" | Fills entire column with "A". |
| Temp Drop | df.drop("Col", axis=1) | Original data safe. |
| Perm Drop | ... inplace=True | Original data modified. |