How to Fix 'memory not persisting during development' in AutoGen (Python)
What the error actually means
If your AutoGen agent “forgets” prior messages during development, the problem is usually not memory itself. It’s almost always one of two things: you’re recreating the agent each turn, or you’re using a chat pattern that doesn’t persist state across calls.
The common symptom looks like this:
ValueError: No conversation history found for agent
Or more often, nothing crashes — the agent just responds as if it never saw the previous message.
The Most Common Cause
The #1 cause is re-instantiating AssistantAgent or UserProxyAgent inside a function that runs on every request. That gives you a fresh object every time, so any conversation state the agent was holding (for example chat_messages in the classic pyautogen API) starts empty again.
Broken pattern vs fixed pattern
| Broken | Fixed |
|---|---|
| Agent created inside request handler | Agent created once and reused |
| New GroupChat / GroupChatManager per call | Persistent manager/session object |
| History stored in local variables only | History stored outside the function |
# BROKEN: agent recreated on every call
from autogen import AssistantAgent

def ask_model(message: str):
    assistant = AssistantAgent(
        name="assistant",
        llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "..."}]}
    )
    result = assistant.generate_reply(messages=[{"role": "user", "content": message}])
    return result
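# FIXED: agent and history live outside the request path
from autogen import AssistantAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "..."}]}
)
history = []  # generate_reply() only sees the messages passed to it

def ask_model(message: str):
    # Reuse the same agent and send the accumulated history, not just the latest turn
    history.append({"role": "user", "content": message})
    result = assistant.generate_reply(messages=history)
    if isinstance(result, str):
        history.append({"role": "assistant", "content": result})  # remember the reply too
    return result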
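If you’re using initiate_chat(), the same rule applies: reuse the agents across turns unless you intentionally want stateless behavior. In the classic API, initiate_chat() also clears the previous conversation by default (clear_history=True), so follow-up turns need clear_history=False.
# BROKEN: new agents (and therefore a new conversation) on each call
from autogen import AssistantAgent, UserProxyAgent

def run_chat(user_message):
    user = UserProxyAgent(name="user")
    assistant = AssistantAgent(name="assistant", llm_config=llm_config)  # llm_config as in the earlier example
    user.initiate_chat(assistant, message=user_message)

# FIXED: keep the agents alive across turns and keep their history
user = UserProxyAgent(name="user")
assistant = AssistantAgent(name="assistant", llm_config=llm_config)

def run_chat(user_message):
    # clear_history defaults to True, which would wipe the earlier turns
    user.initiate_chat(assistant, message=user_message, clear_history=False)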
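A single module-level agent pair is shared by every caller, which is rarely what a multi-user app wants. Here is a minimal sketch of the "persistent manager/session object" row from the table. It assumes a hypothetical `SessionRegistry` helper and `build_agents` factory, neither of which is part of AutoGen. The factory is a stand-in so the example runs without an API key.

```python
from typing import Any, Callable, Dict


class SessionRegistry:
    """Hypothetical helper (not an AutoGen class): builds agents once per session."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory              # called only the first time a session id is seen
        self._sessions: Dict[str, Any] = {}

    def get(self, session_id: str) -> Any:
        # Reuse the existing object if this session has been seen before.
        if session_id not in self._sessions:
            self._sessions[session_id] = self._factory()
        return self._sessions[session_id]


def build_agents() -> dict:
    # Stand-in factory. In a real app this would construct your
    # UserProxyAgent / AssistantAgent pair and return them.
    return {"turns": []}


registry = SessionRegistry(build_agents)
first = registry.get("alice")
second = registry.get("alice")   # same object as `first`
other = registry.get("bob")      # separate object for a separate session
```

The registry itself still lives in process memory, so it survives across requests but not across restarts.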
Other Possible Causes
1) You’re using stateless API calls instead of chat history
If you call the model with only the latest user message, AutoGen has no prior context to send.
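# Broken: single-turn payload only, so the model never sees earlier turns
messages = [{"role": "user", "content": "What did I say earlier?"}]
response = assistant.generate_reply(messages=messages)
# Fixed: `messages` is a list kept alive across turns that already holds the earlier ones
messages.append({"role": "user", "content": "What did I say earlier?"})
response = assistant.generate_reply(messages=messages)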
2) Your history variable gets overwritten
This happens a lot in notebooks and Flask/FastAPI handlers.
# Broken: history reset on each function call
def handle_turn(user_text):
    history = []
    history.append({"role": "user", "content": user_text})
    return assistant.generate_reply(messages=history)

# Fixed: persist history outside the function
history = []

def handle_turn(user_text):
    history.append({"role": "user", "content": user_text})
    return assistant.generate_reply(messages=history)
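A persistent list only helps if both sides of the conversation end up in it. The sketch below records the reply as well as the user turn. `reply_fn` and `fake_reply` are hypothetical stand-ins for the model call, so the example runs without AutoGen installed.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def handle_turn(history: List[Message], user_text: str,
                reply_fn: Callable[[List[Message]], str]) -> str:
    """Append the user turn, get a reply, then append the reply as well."""
    history.append({"role": "user", "content": user_text})
    reply = reply_fn(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_reply(messages: List[Message]) -> str:
    # Stand-in for the model call: reports how many messages it was given.
    return f"I can see {len(messages)} message(s)"


history: List[Message] = []
first = handle_turn(history, "My name is Ada.", fake_reply)
second = handle_turn(history, "What is my name?", fake_reply)
print(first)
print(second)
print(len(history))
```

In a real handler, `reply_fn` could wrap `assistant.generate_reply(messages=...)`. That call may return something other than a plain string, so check what it returns before appending it.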
3) You are mixing up AutoGen versions or APIs
Older AutoGen examples use different patterns than newer ones. If you copy code from an old blog post, methods like initiate_chat() or memory-related fields may not behave as expected in your installed version.
Check your installed package:
pip show pyautogen
pip show autogen-agentchat
And confirm which API you’re using:
from autogen import AssistantAgent # older-style usage in many examples
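The classic API (the pyautogen package, often referred to as 0.2) is imported with from autogen import ... and uses initiate_chat() and generate_reply(). The newer autogen-agentchat package (0.4 and later) is imported from autogen_agentchat, is async, and handles state differently. Match the examples you copy to the package you actually have installed.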
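The same check can be run from inside the interpreter your app actually uses, which helps when several virtual environments are in play. This is a small sketch using only the standard library. The `installed_versions` helper name is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Sequence

# Distribution names mentioned above. Extend the tuple if you installed
# AutoGen under a different distribution name.
CANDIDATES = ("pyautogen", "autogen-agentchat")


def installed_versions(names: Sequence[str] = CANDIDATES) -> Dict[str, Optional[str]]:
    """Return {distribution name: version string, or None if not installed}."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions()
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```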
4) Your app restarts between requests
In development, this is common with:
- Flask debug reloader
- FastAPI/Uvicorn --reload
- Streamlit reruns on every interaction
- Jupyter cell re-execution
If the process restarts, all in-memory state disappears.
uvicorn app:app --reload # restarts the process on file changes
flask --debug run # debug mode enables the reloader, which restarts on file changes
streamlit run app.py # reruns the whole script on every interaction
For debugging persistence issues, temporarily disable reload behavior and see if memory suddenly works.
How to Debug It
- Print object identity
  - Confirm whether you are creating a new agent each turn.
  - If id(assistant) changes between requests, your memory will reset.

      print("assistant id:", id(assistant))

- Inspect message length before each call
  - If len(history) stays at 1, you are not persisting messages.
  - A healthy conversation should grow over turns.

      print("history size:", len(history))

- Check whether your server is restarting
  - Look for reload logs from Uvicorn/Flask/Streamlit.
  - Add a startup log and see whether it prints again on each request.

      print("process started")

- Reduce to one persistent script
  - Remove web framework code.
  - Run a plain Python file with one global agent and one global history list.
  - If it works there, the bug is in your framework lifecycle, not AutoGen.
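The first three checks above can be folded into one helper that you log before every model call. This is a sketch only: `debug_snapshot` and `PROCESS_STARTED_AT` are hypothetical names, and the agent is a plain `object()` stand-in.

```python
import os
import time

# Set once at import time. A different value (or a different pid) on a later
# request means the process was restarted in between.
PROCESS_STARTED_AT = time.time()


def debug_snapshot(agent: object, history: list) -> dict:
    """Collect the values worth comparing between two turns."""
    return {
        "pid": os.getpid(),
        "process_started_at": PROCESS_STARTED_AT,
        "agent_id": id(agent),
        "history_size": len(history),
    }


agent = object()  # stand-in for your AssistantAgent instance
history = [{"role": "user", "content": "hi"}]

before = debug_snapshot(agent, history)
history.append({"role": "assistant", "content": "hello"})
after = debug_snapshot(agent, history)
print(before)
print(after)
```

If `agent_id` differs between turns, the agent is being recreated. If `pid` or `process_started_at` differs, the server restarted. If `history_size` never grows, the history is being reset.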
Prevention
- Create agents once at application startup, not inside request handlers.
- Store conversation state explicitly in Redis, a database, or session storage if you need persistence across process restarts.
- Treat notebook cells and auto-reload servers as hostile to in-memory state unless you’ve verified persistence end to end.
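To make the second point concrete, here is a minimal sketch of history that survives a restart. It uses the standard library's sqlite3 as a stand-in for Redis or a real database. The table layout and the `open_store`, `save_history`, and `load_history` helpers are assumptions made for illustration, not an AutoGen feature.

```python
import json
import os
import sqlite3
import tempfile
from typing import Dict, List

Message = Dict[str, str]


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the history database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "session_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
    )
    return conn


def save_history(conn: sqlite3.Connection, session_id: str, messages: List[Message]) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO history (session_id, messages) VALUES (?, ?)",
            (session_id, json.dumps(messages)),
        )


def load_history(conn: sqlite3.Connection, session_id: str) -> List[Message]:
    row = conn.execute(
        "SELECT messages FROM history WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


# Demo: write, close the connection (simulating a restart), then read back.
db_path = os.path.join(tempfile.mkdtemp(), "chat_history.db")
conn = open_store(db_path)
save_history(conn, "alice", [{"role": "user", "content": "My name is Ada."}])
conn.close()

conn = open_store(db_path)
restored = load_history(conn, "alice")
missing = load_history(conn, "bob")
conn.close()
```

On each request you would load the session's history, append the new turn and the reply, and save it back before returning.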
By Cyprian Aarons, AI Consultant at Topiax.