How to Fix 'memory not persisting during development' in LangChain (Python)
When LangChain memory “doesn’t persist during development,” it usually means your chain runs, but the conversation state resets on every request, reload, or notebook cell execution. In practice, this shows up when you’re using ConversationBufferMemory, ConversationSummaryMemory, or a custom memory object, and the agent behaves like it has never seen the previous turn.
The key point: LangChain memory is not durable storage unless you wire it to something persistent. If you recreate the chain, reinitialize the memory object, or run in a stateless server process, the buffer is gone.
The Most Common Cause
The #1 cause is instantiating memory inside the request handler or inside code that runs on every reload. That creates a fresh ConversationBufferMemory each time, so nothing survives between calls.
Here’s the broken pattern:
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
def chat(user_input: str):
    # A new LLM, memory, and chain are built on every call,
    # so the buffer is empty at the start of each request
    llm = ChatOpenAI(model="gpt-4o-mini")
    memory = ConversationBufferMemory()
    chain = ConversationChain(llm=llm, memory=memory)
    return chain.invoke({"input": user_input})
And here’s the fixed pattern:
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
# Create the LLM, memory, and chain once at module scope
# so the buffer lives as long as the process does
llm = ChatOpenAI(model="gpt-4o-mini")
memory = ConversationBufferMemory()
chain = ConversationChain(llm=llm, memory=memory)

def chat(user_input: str):
    return chain.invoke({"input": user_input})
| Broken | Fixed |
|---|---|
| memory = ConversationBufferMemory() inside chat() | Memory created once at module scope |
| New ConversationChain per request | Reuse the same chain instance |
| State resets every call | State persists for the lifetime of the process |
If you’re seeing behavior like this in logs:
- ConversationBufferMemory returns an empty history on every turn
- The agent repeats “I don’t have access to previous messages”
- Debug output shows a new chain object each request
then this is almost certainly your issue.
Other Possible Causes
1) You’re using a stateless server or auto-reloading dev server
If you run Flask with debug reload, Uvicorn with multiple workers, or any server that restarts processes frequently, in-memory state disappears.
# Bad for persistence during dev
uvicorn app:app --reload --workers 2
Fix:
# One worker, so every request hits the same process; note that --reload
# still wipes in-memory state whenever the code changes
uvicorn app:app --reload --workers 1
Better yet, move memory to Redis or a database-backed store if you need persistence across restarts.
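One way to do that is to back the buffer with a Redis-hosted message history. This is a minimal sketch, assuming the langchain-community and redis packages are installed and a local Redis is running at redis://localhost:6379; the session_id is just a placeholder.
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import RedisChatMessageHistory

# Messages are written to Redis, so they survive process restarts and reloads
history = RedisChatMessageHistory(session_id="dev-session", url="redis://localhost:6379/0")
memory = ConversationBufferMemory(chat_memory=history, memory_key="history")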
2) You’re mixing up memory_key and prompt variables
LangChain memory injects history into a specific variable name. If your prompt expects chat_history but your memory writes to history, it looks like memory is broken.
from langchain.memory import ConversationBufferMemory
# Broken if your prompt uses {chat_history}
memory = ConversationBufferMemory(memory_key="history")
Fix it by matching names:
memory = ConversationBufferMemory(memory_key="chat_history")
Your prompt should include the same placeholder:
prompt_template = """Chat history:
{chat_history}
Human: {input}
AI:"""
3) You forgot to pass session-scoped storage in LangGraph or custom wrappers
If you build an agent wrapper around LangChain and store state in a local variable, that state is wiped on every call.
# Broken: local dict resets per call
def get_memory():
    return {}
Use a session map keyed by user/session ID:
sessions = {}

def get_memory(session_id: str):
    if session_id not in sessions:
        sessions[session_id] = []
    return sessions[session_id]
For production, replace that with Redis or Postgres. A Python dict only persists while the process lives.
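The same pattern works for LangChain memory objects themselves. A rough sketch (the session_memories dict and get_session_memory name are illustrative) that keeps one ConversationBufferMemory per session:
from langchain.memory import ConversationBufferMemory

session_memories: dict[str, ConversationBufferMemory] = {}

def get_session_memory(session_id: str) -> ConversationBufferMemory:
    # Reuse the same memory object for a session across requests
    if session_id not in session_memories:
        session_memories[session_id] = ConversationBufferMemory(memory_key="chat_history")
    return session_memories[session_id]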
4) You’re expecting ConversationBufferWindowMemory to keep everything
ConversationBufferWindowMemory only keeps the last k turns. If you set k=1, older messages disappear by design.
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=1)
If you need full history during debugging:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
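You can confirm the window behavior directly; a quick check (the inputs are made up) looks like this:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "first question"}, {"output": "first answer"})
memory.save_context({"input": "second question"}, {"output": "second answer"})
# With k=1 only the most recent exchange is returned
print(memory.load_memory_variables({}))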
How to Debug It
- Print the chain and memory object IDs. If they change between requests, you are recreating them (see the helper sketch after this list):

print("chain:", id(chain))
print("memory:", id(memory))

- Inspect what LangChain thinks is in memory. For classic memory classes:

print(memory.load_memory_variables({}))

If this returns { 'chat_history': '' } every time, nothing is being saved.

- Check your prompt variable names. Verify that memory_key, prompt placeholders, and chain inputs match exactly. A mismatch won’t always throw an exception; it just silently breaks context injection.

- Run without reload or multiple workers. Disable hot reload and reduce worker count. If persistence starts working, your issue is process lifetime, not LangChain logic.
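If it helps, you can wrap the first two checks in a small helper and call it at the top of each request handler. This is a sketch that assumes chain and memory are the module-level objects from the fixed pattern above; debug_memory_state is a made-up name.
def debug_memory_state(tag: str) -> None:
    # Object IDs should stay constant between requests if you are reusing the chain
    print(f"[{tag}] chain id: {id(chain)}, memory id: {id(memory)}")
    # Shows exactly what will be injected into the prompt on the next turn
    print(f"[{tag}] memory: {memory.load_memory_variables({})}")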
Prevention
- Create memory and chain once per session, not once per request.
- Use persistent storage for anything beyond local debugging:
  - Redis for session state
  - Postgres for audit-friendly conversation logs
  - SQLite if you want simple local persistence during development (see the sketch at the end of this section)
- Keep these three names aligned:
  - the prompt variable name
  - memory_key
  - the input/output keys on the chain
If you want a rule of thumb: if your app can restart and still needs to remember conversations, plain ConversationBufferMemory is not enough. Use persistent backing storage or a session-aware architecture from day one.
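For the SQLite option mentioned above, one possible setup is a SQL-backed message history pointed at a local file. This is a sketch assuming langchain-community and SQLAlchemy are installed; the exact constructor arguments for SQLChatMessageHistory have shifted between versions, so check the one you have.
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import SQLChatMessageHistory

# History is written to a local SQLite file, so it survives restarts during development
history = SQLChatMessageHistory(session_id="dev-session", connection_string="sqlite:///chat_history.db")
memory = ConversationBufferMemory(chat_memory=history, memory_key="chat_history")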
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.