# How to Fix "memory not persisting" in AutoGen (Python)
## What “memory not persisting” usually means
In AutoGen, this usually means your agent can answer in the current turn, but the next turn starts with a blank slate. You’ll see symptoms like the assistant forgetting earlier facts, ConversableAgent not retaining chat history, or a custom memory store never being read back after a process restart.
This shows up most often when people expect AutoGen to persist state automatically, but they only configured in-memory chat history. AutoGen will keep messages during the Python process unless you explicitly wire persistence to disk, a database, or your own memory backend.
## The Most Common Cause: You’re using in-memory chat history and expecting persistence
The #1 mistake is assuming ConversableAgent or AssistantAgent will persist messages across runs by default. They won’t. If you restart the script, redeploy the container, or create a new agent instance, the conversation history is gone.
### Broken vs. fixed pattern
| Broken pattern | Fixed pattern |
|---|---|
| Creates a new agent every run and relies on internal message history | Loads and saves state explicitly |
| Uses only chat_messages in memory | Persists messages to disk/DB and rehydrates them on startup |
```python
# BROKEN: history exists only in memory for this Python process
from autogen import ConversableAgent

assistant = ConversableAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)
user = ConversableAgent(name="user", human_input_mode="NEVER")

# First run works
user.initiate_chat(assistant, message="My policy number is P-12345")

# Second run after restart: the assistant forgets everything
```
```python
# FIXED: persist state yourself and reload it on startup
import json
from pathlib import Path

from autogen import ConversableAgent

STATE_FILE = Path("chat_state.json")

def load_state():
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"messages": []}

def save_state(state):
    STATE_FILE.write_text(json.dumps(state, indent=2))

state = load_state()

assistant = ConversableAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)
user = ConversableAgent(name="user", human_input_mode="NEVER")

# Rehydrate prior messages if your workflow supports it.
# Caution: _oai_messages is a private attribute and may change between
# AutoGen versions; prefer a public state API where one is available.
assistant._oai_messages = state["messages"]

chat_result = user.initiate_chat(assistant, message="Continue from previous context")

state["messages"] = assistant._oai_messages
save_state(state)
If you’re using AutoGen’s newer agent abstractions, the same rule applies: conversation memory is not durable unless you make it durable. A fresh process means fresh objects.
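For the newer AgentChat agents (AutoGen v0.4+), the documented way to make memory durable is the async `save_state()`/`load_state()` pair. Here is a minimal sketch assuming that API; `run_with_persistence` and `agent_state.json` are names introduced here, not part of AutoGen, and the agent is assumed to be constructed elsewhere:

```python
import json
from pathlib import Path

STATE_FILE = Path("agent_state.json")

async def run_with_persistence(agent, task: str):
    # Rehydrate prior state if a previous process saved it
    if STATE_FILE.exists():
        await agent.load_state(json.loads(STATE_FILE.read_text()))
    result = await agent.run(task=task)
    # Persist state so the next process can continue the conversation
    STATE_FILE.write_text(json.dumps(await agent.save_state()))
    return result
```

Wrapping every turn this way means a container restart only costs you a disk read, not the whole conversation.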
## Other Possible Causes
### 1) You recreate the agent on every request
If you instantiate AssistantAgent inside a web handler or function that runs per request, the object dies at the end of the request.
```python
# BROKEN: a fresh agent (and empty history) is created per request
def handle_request(prompt: str):
    assistant = AssistantAgent(
        name="assistant",
        model_client=model_client,  # assumes a model client configured elsewhere
    )
    return assistant.run(prompt)
```

```python
# FIXED: create the agent once, outside the request handler, and reuse it
assistant = AssistantAgent(
    name="assistant",
    model_client=model_client,
)

def handle_request(prompt: str):
    return assistant.run(prompt)
```
If you need per-user memory, keep one agent per session ID and store it in Redis, Postgres, or a durable cache.
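One way to keep one agent per session inside a single process is a small registry keyed by session ID. This is a sketch: `get_agent` and the module-level dict are names introduced here, and for multi-process deployments you would back this with Redis or Postgres instead of process memory:

```python
from typing import Any, Callable, Dict

# In-process registry: one agent object per session ID.
# Replace with a durable cache (Redis/Postgres) for multi-process setups.
_agents: Dict[str, Any] = {}

def get_agent(session_id: str, factory: Callable[[], Any]) -> Any:
    """Return the existing agent for this session, or create it exactly once."""
    if session_id not in _agents:
        _agents[session_id] = factory()
    return _agents[session_id]
```

Usage would look like `get_agent("user-42", lambda: AssistantAgent(...))`: every request for the same session ID gets the same object, so its history survives between requests.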
### 2) You’re clearing messages accidentally
A lot of “memory not persisting” bugs are self-inflicted by code that resets state between turns.
```python
# BROKEN
messages = []
messages.append({"role": "user", "content": "Remember my claim ID is C-7788"})
messages = []  # accidental reset
```

```python
# FIXED: scope history to a session and append; never rebind the list
session_store.setdefault(session_id, [])
session_store[session_id].append({"role": "user", "content": "Remember my claim ID is C-7788"})
```
If you use GroupChat, check for code that rebuilds the chat object every turn. That wipes prior context too.
### 3) Your termination logic ends the chat before memory is written
Some AutoGen flows stop on max_turns, is_termination_msg, or a custom reply hook. If your persistence happens after chat completion and the flow exits early, nothing gets saved.
```python
# BROKEN: save step never runs if exception/termination happens first
result = user.initiate_chat(assistant, message="Start")
save_chat(result.chat_history)
```

```python
# FIXED: persist in a finally block or callback hook
result = None
try:
    result = user.initiate_chat(assistant, message="Start")
finally:
    # result stays None if initiate_chat raised before returning
    save_chat(result.chat_history if result else [])
```
Also check for termination predicates like:

```python
def is_termination_msg(msg):
    return msg.get("content", "").strip().lower() == "done"
```
If your agent emits "done" too early, you’ll think memory vanished when the conversation just ended.
### 4) You configured an LLM but not an actual memory backend
AutoGen’s model config is not memory storage. This is a common confusion when people see llm_config and assume it covers persistence too.
```python
# BROKEN: model config only; nothing here stores anything
assistant = AssistantAgent(
    name="assistant",
    model_client=model_client,
)
```

```python
# FIXED: pair the model config with an explicit storage/retrieval layer
memory = MyPersistentMemoryStore(redis_client)  # your own persistence class

assistant = AssistantAgent(
    name="assistant",
    model_client=model_client,
)

# Write durable facts and read them back per session
memory.save(session_id, {"claim_id": "C-7788"})
context = memory.load(session_id)
```
For bank and insurance workloads, this should be session-scoped and auditable. Don’t dump private customer data into ad hoc globals.
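A minimal shape for such a store, using only the standard library, could look like the following. `AuditedSessionMemory` is a hypothetical class introduced here (adapt the backend to your database); the point is the append-only audit trail alongside the per-session snapshot:

```python
import json
import time
from pathlib import Path

class AuditedSessionMemory:
    """Session-scoped memory sketch with an append-only audit log.

    Hypothetical example; swap the file backend for your real database.
    """

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, session_id: str, facts: dict) -> None:
        record = {"ts": time.time(), "session": session_id, "facts": facts}
        # Append-only audit trail: one JSON record per line, never rewritten
        with (self.root / "audit.jsonl").open("a") as f:
            f.write(json.dumps(record) + "\n")
        # Latest snapshot per session for fast reads
        (self.root / f"{session_id}.json").write_text(json.dumps(facts))

    def load(self, session_id: str) -> dict:
        snapshot = self.root / f"{session_id}.json"
        return json.loads(snapshot.read_text()) if snapshot.exists() else {}
```

Because every write lands in the audit log with a timestamp, you can reconstruct what the agent knew at any point, which is exactly what compliance reviews in regulated industries ask for.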
## How to Debug It
- **Check whether the bug survives a single process**
  - Run two turns in one Python session.
  - If memory works there but fails after restart, you have no persistence layer.
  - If it fails even within one session, you’re probably recreating or clearing agents/messages.
- **Print object identity**
  - Log `id(agent)` or equivalent across requests.
  - If it changes per request, you’re creating a new instance every time.
- **Inspect stored messages before and after each turn**
  - Dump `agent._oai_messages`, your session dict, or your DB row count.
  - If messages exist before the reply but disappear after it, something is resetting them.
- **Trace termination and save paths**
  - Add logs around `is_termination_msg`, `max_turns`, exception handlers, and persistence writes.
  - Many “not persisting” reports are actually “never saved because execution stopped early.”
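The identity and message-count checks above can be wrapped in one tiny helper, called before and after each turn. `snapshot` is a name introduced here, and the message list is whatever your workflow actually uses; a healthy run shows a stable `id()` and a count that only grows:

```python
def snapshot(label: str, agent, messages) -> dict:
    """Log agent identity and message count so resets are visible."""
    info = {"label": label, "agent_id": id(agent), "n_messages": len(messages)}
    print(f"[{info['label']}] agent id={info['agent_id']} messages={info['n_messages']}")
    return info
```

If `agent_id` changes between requests you are recreating the agent; if `n_messages` ever shrinks, something is clearing history.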
## Prevention

- Keep agent lifecycle separate from request lifecycle.
- Create agents once per session or load them from durable storage.
- Persist both:
  - conversation history
  - derived memory/state such as extracted entities, claim IDs, policy numbers
- Add tests for:
  - same-process continuity
  - restart continuity
  - multi-user isolation
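A restart-continuity check can be a plain pytest test. This sketch assumes the JSON state-file pattern from the fixed example above and simulates a restart by re-reading state from disk rather than reusing in-memory objects:

```python
import json
from pathlib import Path

def test_restart_continuity(tmp_path: Path):
    state_file = tmp_path / "chat_state.json"

    # "Process 1": record a fact, then die
    state = {"messages": [{"role": "user", "content": "Policy P-12345"}]}
    state_file.write_text(json.dumps(state))

    # "Process 2": a cold start must still see the earlier fact
    reloaded = json.loads(state_file.read_text())
    assert reloaded["messages"][0]["content"] == "Policy P-12345"
```

The same-process and multi-user variants follow the same shape: write through your real persistence layer, tear down the objects, and assert the facts come back.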
If you want one rule to remember: AutoGen remembers what exists in Python objects; it does not magically persist what your process destroys. Build storage explicitly and bind it to session identity.
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.