AutoGen Tutorial (Python): adding memory to agents for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to give an AutoGen agent persistent memory in Python using a simple, production-friendly pattern: store conversation facts outside the agent, then inject the relevant facts back into each new turn. You need this when your agent must remember user preferences, prior decisions, or case context across sessions instead of treating every request like a blank slate.

What You'll Need

  • Python 3.10+
  • autogen-agentchat
  • autogen-ext
  • OpenAI API key set as an environment variable
  • A place to persist memory:
    • for this tutorial: a local JSON file
    • for production: Postgres, Redis, or a vector store
  • Basic familiarity with AutoGen AssistantAgent and async Python

Install the packages:

pip install autogen-agentchat autogen-ext openai

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Create a small memory store that reads and writes JSON on disk. This keeps the example simple while still showing the real pattern: separate memory from the agent runtime.
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")

def load_memory() -> dict:
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {}

def save_memory(memory: dict) -> None:
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))
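If the process dies mid-write, the JSON file can be left truncated, and `json.loads` will then raise on every startup. A hardened variant of `load_memory` (a suggested addition, not part of the tutorial's minimal code) treats a missing or corrupt file as empty memory:

```python
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")

def load_memory_safe() -> dict:
    # Hardened variant of load_memory (suggested addition, not part of
    # the tutorial's minimal code): a missing or corrupt file is treated
    # as empty memory instead of crashing the chat loop.
    try:
        return json.loads(MEMORY_FILE.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
```

In production you would also write atomically (write to a temp file, then rename), but for a local tutorial file this guard is usually enough.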
  2. Build a helper that extracts durable facts from user input. In real systems, this could be powered by rules, embeddings, or a classifier; here we keep it deterministic so you can see exactly what is happening.
def extract_after(text: str, marker: str) -> str:
    """Return the text that follows a case-insensitive marker."""
    idx = text.lower().find(marker)
    return text[idx + len(marker):].strip()

def update_memory(memory: dict, user_text: str) -> dict:
    user_text_lower = user_text.lower()

    if "my name is" in user_text_lower:
        # First word after the marker, minus trailing punctuation.
        memory["name"] = extract_after(user_text, "my name is").split()[0].strip(".,!?")

    if "i work at" in user_text_lower:
        memory["company"] = extract_after(user_text, "i work at").split(".")[0]

    if "preferred language is" in user_text_lower:
        memory["preferred_language"] = extract_after(user_text, "preferred language is").split(".")[0]

    return memory
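As the list of patterns grows, the same extraction is often easier to maintain with regular expressions. This is a sketch of an equivalent extractor; the `PATTERNS` table and `update_memory_regex` name are illustrative, not part of the tutorial's code:

```python
import re

# Each pattern maps to a memory key; the first capture group becomes
# the stored value. re.IGNORECASE handles "My name is" vs "my name is".
PATTERNS = {
    "name": re.compile(r"my name is\s+([A-Za-z][\w'-]*)", re.IGNORECASE),
    "company": re.compile(r"i work at\s+([^.!?]+)", re.IGNORECASE),
    "preferred_language": re.compile(r"preferred language is\s+([^.!?]+)", re.IGNORECASE),
}

def update_memory_regex(memory: dict, user_text: str) -> dict:
    for key, pattern in PATTERNS.items():
        match = pattern.search(user_text)
        if match:
            memory[key] = match.group(1).strip()
    return memory

print(update_memory_regex({}, "Hi, my name is Sam. I work at Northwind."))
# → {'name': 'Sam', 'company': 'Northwind'}
```

Each new fact type becomes one table entry instead of another if-block.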
  3. Create the AutoGen agent and inject stored memory into every prompt. The key idea is that the model does not “remember” anything by itself; you provide context from your external store before each response.
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

MODEL_CLIENT = OpenAIChatCompletionClient(model="gpt-4o-mini")

agent = AssistantAgent(
    name="memory_agent",
    model_client=MODEL_CLIENT,
    system_message=(
        "You are a helpful assistant. Use provided memory when relevant. "
        "If memory contains a user's name, company, or preferred language, "
        "use it naturally in your response."
    ),
)

def build_prompt(user_text: str, memory: dict) -> str:
    return f"""Known memory:
{json.dumps(memory, indent=2)}

User message:
{user_text}
"""
  4. Wire everything together in an async chat loop. This loop loads memory, updates it from the new message, saves it back to disk, then sends both the message and stored context to the agent.
async def chat_once(user_text: str) -> str:
    memory = load_memory()
    memory = update_memory(memory, user_text)
    save_memory(memory)

    prompt = build_prompt(user_text, memory)
    result = await agent.run(task=prompt)
    return result.messages[-1].content

async def main():
    first = await chat_once("Hi, my name is Sam. I work at Northwind.")
    print(first)

if __name__ == "__main__":
    asyncio.run(main())
  5. Test persistence across separate runs. Send one message that sets facts, then ask follow-up questions that depend on them. Because the facts live on disk, the same follow-ups also work after restarting the script.
async def demo():
    print(await chat_once("My name is Sam and my preferred language is Python."))
    print(await chat_once("What should you call me?"))
    print(await chat_once("What language do I prefer?"))

# Uncomment to run the demo instead of main()
# if __name__ == "__main__":
#     asyncio.run(demo())

Testing It

Run the script once with a message that includes durable facts like your name or company. Then run it again and ask a follow-up question such as “What’s my name?” or “Where do I work?” If the file-backed memory is working, the second run should still have access to those details because they were saved outside the agent.

A good test is to delete agent_memory.json, rerun the first message, and confirm that only new information gets remembered. Then inspect the JSON file directly to verify that your extracted fields are being persisted correctly.
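The round-trip can also be checked without the model in the loop. This sketch exercises the same read-after-write pattern against a temporary file, so it never touches your real agent_memory.json:

```python
import json
import tempfile
from pathlib import Path

# Standalone persistence check using the tutorial's file-backed pattern,
# pointed at a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    memory_file = Path(tmp) / "agent_memory.json"

    # "First run": write extracted facts to disk.
    memory_file.write_text(json.dumps({"name": "Sam", "company": "Northwind"}, indent=2))

    # "Second run": a fresh read from disk should recover the same facts.
    restored = json.loads(memory_file.read_text())
    assert restored["name"] == "Sam"
    assert restored["company"] == "Northwind"
    print("persistence check passed")
```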

Next Steps

  • Replace the JSON file with Redis or Postgres so memory survives multiple app instances.
  • Add semantic retrieval for long-term notes instead of only storing exact fields.
  • Use an LLM-based extractor to turn free-form chat into structured memories like preferences, entities, and decisions.
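For the first bullet, the key point is that `load_memory`/`save_memory` is the only surface the rest of the code touches, so swapping backends is a local change. Here is the shape using stdlib `sqlite3` as a stand-in so the sketch runs anywhere; a Redis version would replace the SQL calls with `client.get()`/`client.set()`, and `DB_PATH` and the `default` key are illustrative names:

```python
import json
import sqlite3

# Database-backed store with the same load/save interface as the JSON
# version. sqlite3 (stdlib) stands in for Redis/Postgres here.
DB_PATH = "agent_memory.db"  # illustrative path

def _connect() -> sqlite3.Connection:
    conn = sqlite3.connect(DB_PATH)
    conn.execute("CREATE TABLE IF NOT EXISTS memory (k TEXT PRIMARY KEY, v TEXT)")
    return conn

def load_memory() -> dict:
    with _connect() as conn:
        row = conn.execute("SELECT v FROM memory WHERE k = 'default'").fetchone()
    return json.loads(row[0]) if row else {}

def save_memory(memory: dict) -> None:
    with _connect() as conn:
        conn.execute(
            "INSERT OR REPLACE INTO memory (k, v) VALUES ('default', ?)",
            (json.dumps(memory),),
        )
```

Because the function signatures match the JSON versions, `chat_once` does not change at all.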

By Cyprian Aarons, AI Consultant at Topiax.