Remove Duplicate Images using Python 🐍
Do you have a folder full of images and want to delete the duplicates? Here's a simple Python script that scans all the images in a folder and keeps only the unique ones based on their visual hash. This is especially useful for meme folders, downloaded wallpapers, or social media collections.
Features:
- Uses
imagehash
andPillow
to identify duplicates visually (not just by filename or size). - Skips corrupted or unreadable image files.
- Prints duplicate image pairs for reference.
- Automatically creates an output folder and saves only unique images there.
Python Code:
import os
import shutil
from PIL import Image, UnidentifiedImageError
import imagehash
# Input and output folder paths
input_folder = r'images_output' # Images Input Folder
output_folder = r'images_output' # Images Output Folder (you can change this to a different path)
# Create the output folder if it doesn't exist
os.makedirs(output_folder, exist_ok=True)
# Dictionary to store image hashes and their paths
image_hashes = {}
# Iterate through files in the folder
for filename in os.listdir(input_folder):
if filename.lower().endswith(('png', 'jpg', 'jpeg', 'gif', 'bmp', 'webp')):
img_path = os.path.join(input_folder, filename)
try:
with Image.open(img_path) as img:
img.load() # Force load image to catch truncation errors
# Calculate image hash
img_hash = imagehash.average_hash(img)
# Check if hash already exists
if img_hash in image_hashes:
print(f"Duplicate found: {img_path} and {image_hashes[img_hash]}")
else:
# Copy unique image to the output folder
shutil.copy(img_path, output_folder)
image_hashes[img_hash] = img_path
except (OSError, UnidentifiedImageError) as e:
print(f"Skipping file due to error: {img_path} — {e}")
print("✅ Unique images have been saved to the output folder.")
How to Use:
- Install the required libraries (if not already installed):
- Put all your images inside a folder (e.g.,
images_output
). - Run the script in your Python environment (VS Code, IDLE, or any terminal).
- All unique images will be copied to the same folder (or change output path to keep them separate).
pip install Pillow imagehash
Notes:
- It uses average_hash from
imagehash
, which is good for detecting visually similar images even with minor edits. - You can replace it with
phash
,whash
, ordhash
depending on your needs. - If you want to move (not copy) files, change
shutil.copy
toshutil.move
.
💡 Pro Tip: This can be used to clean meme folders before uploading them to Instagram, YouTube Shorts, etc.
🔁 Like this tip? Let me know in the comments!