🧠 Image Analysis Pipeline Using QR Detection, OCR, and Ollama (Python)
In this project, we build an intelligent image understanding pipeline using Python. The script extracts QR code links, performs OCR text extraction, and then uses a local LLaMA model via Ollama to generate a clean, factual summary of the image content.
This approach is ideal for notices, posters, documents, circulars, and scanned images that contain both text and QR codes.
🔧 What This Project Does
- ✔ Detects and extracts QR code links from an image
- ✔ Extracts text from the image using OCR
- ✔ Sends extracted data to a local LLaMA model
- ✔ Generates a concise, factual summary
- ✔ Preserves dates, links, and important details
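At a high level, these steps form a three-stage pipeline: QR extraction, OCR, then summarization. The shape of that flow can be sketched with stand-in functions in place of the real QR/OCR/LLM calls (the real implementations appear in the full script later in this post):

```python
# High-level shape of the pipeline: image path -> QR links + OCR text -> summary.
# Each stage is injected as a function here, so the flow runs without any
# external dependencies; the real stages (pyzbar, pytesseract, ollama) come later.

def run_pipeline(image_path, extract_qr, extract_text, summarize):
    qr_links = extract_qr(image_path)      # stage 1: QR code links
    ocr_text = extract_text(image_path)    # stage 2: OCR text
    return summarize(ocr_text, qr_links)   # stage 3: LLM summary

# Stand-in stages, purely for illustration.
result = run_pipeline(
    "qr_image.png",
    extract_qr=lambda p: ["https://example.com/notice"],
    extract_text=lambda p: "Exam on 12 May 2025.",
    summarize=lambda text, links: f"- {text}\n- Links: {', '.join(links)}",
)
print(result)
```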
📦 Libraries Used & Their Purpose
- Pillow (PIL) – Opens and processes image files
- pyzbar – Detects and decodes QR codes from images
- pytesseract – Extracts text from images using OCR
- ollama – Sends extracted content to a local LLaMA model
⬇ Install Required Dependencies
Install the Python libraries:

```bash
pip install pillow pyzbar pytesseract ollama
```
Install Tesseract OCR (Windows users):
- Download and install Tesseract OCR
- Set the correct path in the script
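Before running the script, it helps to confirm that Tesseract is actually reachable. A small check (the Windows path below is the installer's default and is an assumption; adjust it if you installed Tesseract elsewhere):

```python
import shutil

# Default install location used by the Windows Tesseract installer.
# NOTE: this path is an assumption; change it if you chose a different folder.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

def find_tesseract():
    """Return the tesseract binary found on PATH, else the default Windows path."""
    return shutil.which("tesseract") or DEFAULT_WINDOWS_PATH

print(find_tesseract())
```

The returned path can be assigned to `pytesseract.pytesseract.tesseract_cmd` instead of hard-coding it.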
🤖 Ollama Model Used
This project uses the `llama3.2` model.

Download the model:

```bash
ollama pull llama3.2
```
🧪 Python Code: Image → QR → OCR → Summary
```python
# Atul || ChatGPT

import pytesseract
from PIL import Image
from ollama import chat
from pyzbar.pyzbar import decode

# -------------------- QR Code Detection --------------------
def qr_find(image_link):
    img = Image.open(image_link)
    results = decode(img)
    qr_present = [qr.data.decode() for qr in results]
    if not qr_present:
        return "No QR code link found in this image"
    return f"List of links present in image: {qr_present}"

# -------------------- Image Path --------------------
image_link = "qr_image.png"
qr_list = qr_find(image_link)

# -------------------- Tesseract Path (Windows) --------------------
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# -------------------- OCR + LLaMA Summary --------------------
def image_summarize(image_link):
    image = Image.open(image_link)
    ocr_text = pytesseract.image_to_string(image).strip()
    response = chat(
        model="llama3.2",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a factual rewriting assistant.\n"
                    "Rules:\n"
                    "- Use ONLY the provided text\n"
                    "- Do NOT infer or add information\n"
                    "- Preserve all dates, times, and conditions\n"
                    "- Rewrite clearly without shortening\n"
                    "Extract only the important and relevant information.\n"
                    "Present the output as concise bullet points."
                ),
            },
            {
                "role": "user",
                "content": (
                    "Rewrite the following notice while keeping all information intact:\n\n"
                    f"{ocr_text}\n\n"
                    "QR links present in the image:\n"
                    f"{qr_list}"
                ),
            },
        ],
    )
    return response["message"]["content"]

# -------------------- Run --------------------
print(image_summarize(image_link))
```
📄 Output
The final output is a clean bullet-point summary that includes:
- ✔ Important extracted text
- ✔ Dates, deadlines, and notices
- ✔ QR code links found in the image
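Because the model is instructed to answer in bullet points, the reply is easy to post-process. A small helper that turns the raw reply into a Python list (a sketch; it assumes each point starts with `-`, `*`, or `•`, which the prompt encourages but the model does not guarantee):

```python
def parse_bullets(summary: str) -> list[str]:
    """Extract bullet-point lines ('-', '*', or '•') from the model's reply."""
    bullets = []
    for line in summary.splitlines():
        line = line.strip()
        if line.startswith(("-", "*", "•")):
            bullets.append(line.lstrip("-*• ").strip())
    return bullets

# Hypothetical model reply, for illustration only.
reply = "- Exam date: 12 May 2025\n- Venue: Hall A\n• Link: https://example.com"
print(parse_bullets(reply))
```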
📌 Use Cases
- ✔ Government notices
- ✔ Exam or admission circulars
- ✔ Event posters
- ✔ Scanned documents with QR links
- ✔ AI document automation
🏁 Conclusion
This project demonstrates how multiple AI tools can work together to convert an image into structured, meaningful information. It is fully offline, privacy-friendly, and easily extendable into RAG pipelines or web applications.
Happy building intelligent image pipelines 🚀
