Building a Chatbot with Memory (History Storage)

In previous tutorials, we used the generate() function. However, generate() has a major flaw for building chatbots: it has amnesia. Every call is independent, so it forgets everything you said previously.

To build a true conversational AI that remembers context, we must use the chat() function. This function allows us to pass an entire history of our conversation back and forth.
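
To see the difference for yourself, here is a minimal sketch (assuming the llama3.2 model from earlier tutorials is pulled locally; the name "Sam" is just an illustrative value). Notice that the second call has no access to the first:

import ollama

# Each generate() call is completely independent; no history is kept between them.
first = ollama.generate(model="llama3.2:latest", prompt="My name is Sam. Please remember it.")
second = ollama.generate(model="llama3.2:latest", prompt="What is my name?")

print(second['response'])  # The model cannot answer: the first prompt is already forgotten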


📖 Step 1: Understanding the Message Format

Instead of passing a simple string as a prompt, the chat() function requires a list of dictionaries. Each dictionary represents a single message in the conversation and must contain a role and a content field.

The 3 Roles:
  • system: The instructions you give the AI behind the scenes (e.g., "You are a helpful assistant").
  • user: The questions or prompts you type.
  • assistant: The answers the AI gives back.

Here is what a conversation history looks like:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is the population of that city?"} # The AI knows "that city" is Paris!
]

🤖 Step 2: Coding a Continuous Chat Loop

Let's write a Python script that acts just like ChatGPT. We will use a while loop to keep the conversation going and append every new message to our history list.

import ollama

# 1. Initialize our empty history list
message_history = []

# 2. Add the system prompt (Optional but recommended)
message_history.append({"role": "system", "content": "You are a helpful assistant."})

# 3. Start the continuous chat loop
while True:
    # Get input from the user
    user_input = input("User: ")

    # Provide a way to exit the loop
    if user_input.lower() == 'quit':
        print("Ending chat...")
        break
    
    # 4. Save the user's message to history
    message_history.append({"role": "user", "content": user_input})

    # 5. Send the ENTIRE history to the model
    response = ollama.chat(model="llama3.2:latest", messages=message_history)
    
    # Extract the actual text from the response
    ai_reply = response['message']['content']
    print("Assistant : ", ai_reply)
    
    # 6. Save the AI's reply to history so it remembers its own answer!
    message_history.append({"role": "assistant", "content": ai_reply})

📌 How it works:

Notice how we use response['message']['content'] to get the text. Unlike generate(), which puts the text in response['response'], the chat() function returns a complete message dictionary that we can append straight back into our history!
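
In fact, because response['message'] is itself a ready-made message, step 6 can be shortened. A small sketch (note that newer versions of the ollama library return a Message object here rather than a plain dictionary, but it can still be appended straight into the history):

# response['message'] already looks like {'role': 'assistant', 'content': '...'}
# so you can skip rebuilding the dictionary by hand:
message_history.append(response['message'])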


👁️‍🗨️ Step 3: Adding Images to the Chat History

What if you want to upload an image and have a continuous conversation about it? You can easily attach a Base64-encoded image to a message in the chat history.

import base64
import ollama

# 1. Convert image to Base64 (Same as previous tutorials)
with open('image1.png', 'rb') as f:
    image_64 = base64.b64encode(f.read()).decode('utf-8')

# 2. Setup the history
message_history = []
message_history.append({"role": "system", "content": "You are a helpful assistant."})

# 3. Add the initial message WITH the image
message_history.append({
    "role": "user",
    "content": "This is the image I want to talk about. What do you see?",
    "images": [image_64]  # Notice the 'images' key!
})

print("Image uploaded. You can now ask questions about it. Type 'quit' to exit.")

# 4. Start the chat loop using a Vision Model
while True:
    user_input = input("User: ")
    if user_input.lower() == 'quit':
        break
    
    message_history.append({"role": "user", "content": user_input})

    # Make sure to use a model that supports vision, like gemma3:4b or llava
    response = ollama.chat(model="gemma3:4b", messages=message_history)
    
    ai_reply = response['message']['content']
    print("Assistant: ", ai_reply)

    message_history.append({"role": "assistant", "content": ai_reply})

⚠️ Warning: Memory Limitations

Because we are passing the entire conversation history back to the model every single time we send a new message, long conversations will consume more RAM/VRAM and take longer to process. For advanced applications, you will eventually need to implement logic to delete older messages from the list to keep the context window manageable.
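
Here is a minimal sketch of such trimming logic, keeping the system prompt and only the most recent turns. The limit of 20 messages is an arbitrary illustrative value; tune it to your model's context window:

MAX_MESSAGES = 20  # illustrative limit, not a hard rule

def trim_history(history, limit=MAX_MESSAGES):
    # Keep the system prompt (first message) plus the most recent turns
    if len(history) <= limit:
        return history
    return [history[0]] + history[-(limit - 1):]

# Call this right before each request:
message_history = trim_history(message_history)
response = ollama.chat(model="llama3.2:latest", messages=message_history)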

🚀 Conclusion

By using ollama.chat() and maintaining a list of dictionaries, you can build fully functional, context-aware chatbots that remember user details, follow complex ongoing logic, and even discuss images dynamically.
