Forcing Structured JSON Output
One of the most powerful features of Large Language Models is their ability to read messy, unstructured text and turn it into clean, structured data. In this tutorial, we will learn how to force an Ollama model to output data in a strict JSON format using a schema.
This is incredibly useful if you are building an application or API and need the AI to return data that your code can easily read and process.
📝 Step 1: The Unstructured Prompt
First, let's import the library and define the text we want the AI to analyze. Imagine we scraped this paragraph from a resume or a LinkedIn profile.
import ollama
# The messy, unstructured text we want to extract data from
prompt = """
"John Doe is a 25 year old software developer. His email is john@gmail.com. He knows Python, JavaScript and FastAPI. He has 3 years of experience."
"""
🏗️ Step 2: Defining the Format Schema
Next, we need to tell the model exactly how we want the output formatted. We do this by creating a dictionary that acts as a blueprint (a JSON Schema).
# The format blueprint we want the LLM to follow
format_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"},
        "skills": {
            "type": "array",
            "items": {"type": "string"}
        },
        "experience": {"type": "integer"}
    },
    "required": ["name", "age", "email", "skills", "experience"]
}
💡 What is happening here? We are instructing the AI that it must return an object containing these five keys (all of them are listed under "required"), and we are also constraining the data types (e.g., age must be an integer, skills must be a list of strings).
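To make the "blueprint" idea concrete, here is a minimal sketch of what it means for a Python object to match this schema. The helper name conforms is hypothetical, and it only understands the handful of schema features used in this tutorial; it is not a full JSON Schema validator and is not part of the ollama library.

```python
# Same blueprint as above, repeated so this snippet runs on its own
format_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"},
        "skills": {"type": "array", "items": {"type": "string"}},
        "experience": {"type": "integer"},
    },
    "required": ["name", "age", "email", "skills", "experience"],
}


def conforms(value, schema):
    """Return True if value matches the small schema subset used here.

    Hypothetical helper: handles only object, array, string and integer.
    """
    kind = schema.get("type")
    if kind == "string":
        return isinstance(value, str)
    if kind == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if kind == "array":
        item_schema = schema.get("items", {})
        return isinstance(value, list) and all(
            conforms(item, item_schema) for item in value
        )
    if kind == "object":
        if not isinstance(value, dict):
            return False
        # Every key listed under "required" must be present
        if any(key not in value for key in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        return all(conforms(value[k], props[k]) for k in value if k in props)
    return True  # unknown or missing type: accept


# Hand-written examples, not real model output
good = {
    "name": "John Doe",
    "age": 25,
    "email": "john@gmail.com",
    "skills": ["Python", "JavaScript", "FastAPI"],
    "experience": 3,
}
bad = dict(good, age="25")  # age as a string breaks the "integer" rule

print(conforms(good, format_schema))  # True
print(conforms(bad, format_schema))   # False
```

A check like this can also serve as a cheap safety net on the application side before the extracted data is stored.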
🚀 Step 3: Generating the Response
Finally, we pass both the prompt and our format_schema to the generate function.
# Pass the prompt and the format schema to the model
response = ollama.generate(
    model="llama3.2:latest",
    prompt=prompt,
    format=format_schema
)
# Print the final structured data
print(response['response'])
📌 Expected Output
The model skips the conversational filler and returns JSON that follows the schema, which you can immediately parse into a Python dictionary or save to a database.
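Since response['response'] is a string, it still has to be parsed before your code can use it. Below is a minimal sketch using the standard json module. The helper name parse_response is hypothetical, and the sample string is a hand-written stand-in consistent with the input text, not captured model output.

```python
import json


def parse_response(raw_text):
    """Parse JSON text into a dict; return None if it is not a JSON object."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


# Hand-written stand-in for response['response'] (not real model output)
sample = (
    '{"name": "John Doe", "age": 25, "email": "john@gmail.com", '
    '"skills": ["Python", "JavaScript", "FastAPI"], "experience": 3}'
)

person = parse_response(sample)
print(person["skills"])          # ['Python', 'JavaScript', 'FastAPI']
print(person["age"] + 1)         # 26, because age is a real int, not a string
print(parse_response("oops"))    # None
```

In the tutorial's flow you would call parse_response(response['response']) instead of passing the sample string.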
⚠️ Important Note on Missing Data
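If the text is missing one of the required fields (for example, it contains no email address), the model may invent a value (e.g., fake@email.com) to satisfy the schema.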
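🛠️ The Fix: To reduce invented values, add a strict instruction to your prompt like: "If a specific piece of data is not found in the text, output null for that field. Do not invent data." For null to be a valid output, the schema must also permit it for that field; the schema above types every field as a string, integer, or array.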
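One way to let the model actually emit null is to widen each scalar field's type so that it also accepts null. The sketch below does that on a copy of the schema and prepends the instruction to the text. The names make_nullable, build_prompt and NO_INVENT_INSTRUCTION are hypothetical, and whether your Ollama version honors a type list such as ["string", "null"] is an assumption you should verify.

```python
import copy

# Hypothetical instruction text, adapted from the tip above
NO_INVENT_INSTRUCTION = (
    "If a specific piece of data is not found in the text, "
    "output null for that field. Do not invent data."
)


def make_nullable(schema):
    """Return a copy of an object schema whose scalar fields also accept null."""
    result = copy.deepcopy(schema)
    for prop in result.get("properties", {}).values():
        kind = prop.get("type")
        # Arrays can already express "nothing found" as an empty list
        if isinstance(kind, str) and kind not in ("array", "object"):
            prop["type"] = [kind, "null"]
    return result


def build_prompt(text):
    """Put the no-invention instruction ahead of the text to analyze."""
    return f"{NO_INVENT_INSTRUCTION}\n\nText:\n{text}"


# Small demo schema (a subset of format_schema from Step 2)
demo_schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["email", "skills"],
}

nullable_schema = make_nullable(demo_schema)
print(nullable_schema["properties"]["email"])  # {'type': ['string', 'null']}
print(demo_schema["properties"]["email"])      # {'type': 'string'} (original untouched)
```

You would then pass build_prompt(your_text) as prompt and the nullable schema as format in the ollama.generate call from Step 3.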