Building a Chatbot with Memory
In previous tutorials, we used the generate() function. However, generate() has a major flaw for building chatbots: it has amnesia. Each call is completely independent, so it forgets everything you said previously.
To build a true conversational AI that remembers context, we must use the chat() function. This function allows us to pass an entire history of our conversation back and forth.
📖 Step 1: Understanding the Message Format
Instead of passing a simple string as a prompt, the chat() function requires a list of dictionaries. Each dictionary represents a single message in the conversation and must contain a role key and a content key.
- system: The instructions you give the AI behind the scenes (e.g., "You are a helpful assistant").
- user: The questions or prompts you type.
- assistant: The answers the AI gives back.
Here is what a conversation history looks like:
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is the population of that city?"},  # The AI knows "that city" is Paris!
]
```
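To keep the format straight, here is a small hypothetical helper (not part of the ollama library, just a convenience for this tutorial) that appends messages in this exact shape and rejects unknown roles:

```python
# Hypothetical helper for building a chat history in the format above.
VALID_ROLES = {"system", "user", "assistant"}

def add_message(history, role, content):
    """Append a {'role': ..., 'content': ...} dict, validating the role."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role}")
    history.append({"role": role, "content": content})
    return history

messages = []
add_message(messages, "system", "You are a helpful assistant.")
add_message(messages, "user", "What is the capital of France?")
add_message(messages, "assistant", "The capital of France is Paris.")
print(len(messages))  # → 3
```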
🤖 Step 2: Coding a Continuous Chat Loop
Let's write a Python script that acts just like ChatGPT. We will use a while loop to keep the conversation going, and append every new message to our history list.
```python
import ollama

# 1. Initialize our empty history list
message_history = []

# 2. Add the system prompt (optional but recommended)
message_history.append({"role": "system", "content": "You are a helpful assistant."})

# 3. Start the continuous chat loop
while True:
    # Get input from the user
    user_input = input("User: ")

    # Provide a way to exit the loop
    if user_input.lower() == 'quit':
        print("Ending chat...")
        break

    # 4. Save the user's message to history
    message_history.append({"role": "user", "content": user_input})

    # 5. Send the ENTIRE history to the model
    response = ollama.chat(model="llama3.2:latest", messages=message_history)

    # Extract the actual text from the response
    ai_reply = response['message']['content']
    print("Assistant:", ai_reply)

    # 6. Save the AI's reply to history so it remembers its own answer!
    message_history.append({"role": "assistant", "content": ai_reply})
```
📌 How it works:
Notice how we use response['message']['content'] to get the text. Unlike generate(), which puts the text in response['response'], chat() returns a complete message dictionary under response['message'] that we can append directly back into our history!
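Because response['message'] already has the {'role': ..., 'content': ...} shape, you can append the whole dictionary instead of rebuilding it by hand. A minimal sketch, using a simulated response dict rather than a live Ollama call:

```python
message_history = [{"role": "user", "content": "What is the capital of France?"}]

# Simulated return value of ollama.chat() -- a real call needs a running Ollama server
response = {"message": {"role": "assistant", "content": "The capital of France is Paris."}}

# Append the message dict directly instead of reconstructing it
message_history.append(response['message'])

print(message_history[-1]["role"])  # → assistant
```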
👁️🗨️ Step 3: Adding Images to the Chat History
What if you want to upload an image and have a continuous conversation about that image? You can easily inject a Base64 encoded image into the chat history.
```python
import base64
import ollama

# 1. Convert the image to Base64 (same as previous tutorials)
with open('image1.png', 'rb') as f:
    image_64 = base64.b64encode(f.read()).decode('utf-8')

# 2. Set up the history
message_history = []
message_history.append({"role": "system", "content": "You are a helpful assistant."})

# 3. Add the initial message WITH the image
message_history.append({
    "role": "user",
    "content": "This is the image I want to talk about. What do you see?",
    "images": [image_64]  # Notice the 'images' key!
})

print("Image uploaded. You can now ask questions about it. Type 'quit' to exit.")

# 4. Start the chat loop using a vision model
while True:
    user_input = input("User: ")
    if user_input.lower() == 'quit':
        break

    message_history.append({"role": "user", "content": user_input})

    # Make sure to use a model that supports vision, like gemma3:4b or llava
    response = ollama.chat(model="gemma3:4b", messages=message_history)

    ai_reply = response['message']['content']
    print("Assistant:", ai_reply)
    message_history.append({"role": "assistant", "content": ai_reply})
```
Because we pass the entire conversation history to the model every single time we send a new message, long conversations consume more RAM/VRAM and take longer to process. For advanced applications, you will eventually need logic that trims older messages from the list to keep the context window manageable.
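As a starting point, here is one possible trimming strategy (a sketch, not the only approach): keep the system prompt, if present, plus only the most recent N messages. Real applications might count tokens instead of messages, or summarize older turns.

```python
def trim_history(history, max_messages=20):
    """Keep the system prompt (if present) plus the most recent messages."""
    if history and history[0]["role"] == "system":
        system, rest = history[:1], history[1:]
    else:
        system, rest = [], history
    return system + rest[-max_messages:]

# Example: a long conversation shrunk down to the system prompt + last 4 messages
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(10):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=4)
print(len(trimmed))  # → 5
```

You would call trim_history(message_history) right before each ollama.chat() call, so the request never grows beyond a fixed size.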
🚀 Conclusion
By using ollama.chat() and maintaining a list of dictionaries, you can build fully functional, context-aware chatbots that remember user details, follow complex ongoing logic, and even discuss images dynamically.