# How to Fix 'state not updating in production' in LangChain (Python)
If your LangChain app works locally but "state not updating in production" shows up once it's deployed, you're usually dealing with a state-management bug, not a LangChain bug. The common pattern: your chain or agent writes state to process memory, but production traffic is spread across multiple requests, workers, or processes, so the update never lands where the next step expects it.

This shows up a lot with `RunnableWithMessageHistory`, `AgentExecutor`, custom callback handlers, and any setup that assumes a single Python process.
## The Most Common Cause
The #1 cause is using in-memory state in a stateless production runtime.
Locally, this often “works” because one process handles the whole request flow. In production, your app may run behind Gunicorn/Uvicorn workers, on serverless functions, or across multiple containers. Each instance has its own memory, so updates disappear between calls.
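The failure mode needs no LangChain at all to demonstrate. Here is a minimal stand-in where two dicts simulate the separate process memory of two workers:

```python
# Minimal sketch of the failure mode: each "worker" is simulated by its
# own dict, standing in for separate process memory.
worker_a_store = {}
worker_b_store = {}

def handle_request(store, session_id, message):
    # Append to this worker's local history and return what it can see
    history = store.setdefault(session_id, [])
    history.append(message)
    return history

# Request 1 lands on worker A, request 2 on worker B:
handle_request(worker_a_store, "abc123", "My name is Sam")
seen_by_b = handle_request(worker_b_store, "abc123", "What is my name?")
print(seen_by_b)  # worker B never saw the first message
```

Worker B's history contains only its own message; the write on worker A is invisible to it, which is exactly what "history is empty on the next request" looks like from the outside.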
### Broken vs. fixed pattern
| Broken pattern | Fixed pattern |
|---|---|
| Store conversation state in a Python dict | Store state in Redis/Postgres/etc. |
| Assume one worker handles all requests | Pass a stable session_id and persist history |
| Mutate local objects inside the chain | Write state through a shared backend |
```python
# BROKEN: in-memory history disappears across workers/processes
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

store = {}  # module-level dict: every worker process gets its own copy

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# A prompt with a MessagesPlaceholder is needed so the wrapped runnable
# accepts the {"input": ..., "history": ...} dict the keys below refer to
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")

chain = RunnableWithMessageHistory(
    prompt | llm,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

# Works locally, breaks when requests hit different workers
result = chain.invoke(
    {"input": "My name is Sam"},
    config={"configurable": {"session_id": "abc123"}},
)
```
```python
# FIXED: persist history outside process memory
from langchain_community.chat_message_histories import RedisChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

def get_session_history(session_id: str):
    # Every worker reads and writes the same Redis-backed history
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379/0",
        key_prefix="chat:",
    )

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")

chain = RunnableWithMessageHistory(
    prompt | llm,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

result = chain.invoke(
    {"input": "My name is Sam"},
    config={"configurable": {"session_id": "abc123"}},
)
```
If you’re seeing behavior like:

- `history` is empty on the next request
- tool output appears once and then vanishes
- agent memory resets after deployment

then this is almost always the root cause.
## Other Possible Causes
### 1) Missing or unstable `session_id`

If your code uses `RunnableWithMessageHistory`, the session key must be stable across requests. If you generate a new UUID every time, you are creating a new conversation every time.
```python
import uuid

# BAD: a fresh UUID per request means a fresh conversation per request
config={"configurable": {"session_id": str(uuid.uuid4())}}

# GOOD: a key that is stable for the user/conversation
config={"configurable": {"session_id": user.id}}
```
If the user ID changes per request, fix that first.
### 2) You’re mutating state inside a tool instead of returning it
Tools should return data. Don’t rely on mutating outer variables and expecting LangChain to preserve them.
```python
# BROKEN: mutates process-local state that other workers never see
state = {"customer_tier": None}

def set_tier(tier: str):
    state["customer_tier"] = tier
    return f"tier set to {tier}"

# FIXED: return the data so the caller can persist it
def set_tier(tier: str):
    return {"customer_tier": tier}
```
Then write that result into your durable store explicitly.
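One way to wire that up is a sketch like the following, where `backend` is a plain dict standing in for your Redis/Postgres layer and `apply_tool_result` is a hypothetical helper, not a LangChain API:

```python
def set_tier(tier: str) -> dict:
    # The tool returns data instead of mutating outer state
    return {"customer_tier": tier}

def apply_tool_result(backend: dict, session_id: str, result: dict) -> None:
    # 'backend' stands in for the durable store (Redis hash, DB row, ...)
    backend.setdefault(session_id, {}).update(result)

backend = {}
apply_tool_result(backend, "abc123", set_tier("gold"))
print(backend)  # {'abc123': {'customer_tier': 'gold'}}
```

The important part is the shape: the tool stays pure, and the write to shared storage happens in one explicit, visible place.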
### 3) Callback handlers are local-only

A custom `BaseCallbackHandler` can log tokens or trace events, but it is not a persistence layer. If you use callbacks to “save” state, that state lives only for the lifetime of the process.

```python
from langchain_core.callbacks import BaseCallbackHandler

class MyHandler(BaseCallbackHandler):
    def on_chain_end(self, outputs, **kwargs):
        print("Saving outputs:", outputs)  # logging only, not persistence
```
Use callbacks for observability, not storage.
### 4) Async race conditions overwrite newer state

In production, two requests can hit the same session at once. If both read old state and write back later, the last write wins and one update disappears.

```python
# Example symptom: concurrent read-modify-write; the last writer wins
await save_state(session_id, old_state + new_message)
```
Fix this with:

- optimistic locking
- database transactions
- Redis locks if needed
- append-only message storage instead of full-object overwrite
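To make the optimistic-locking idea concrete, here is a toy in-process version with illustrative names; a real implementation would lean on your database's compare-and-set or transactions instead:

```python
import threading

class VersionedStore:
    """Toy optimistic-locking store (illustrative, not a real backend)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # session_id -> (version, messages)

    def read(self, session_id):
        with self._lock:
            version, messages = self._data.get(session_id, (0, []))
            return version, list(messages)

    def write(self, session_id, expected_version, messages):
        # Succeeds only if nobody wrote since our read; otherwise the
        # caller must re-read and retry instead of clobbering newer state
        with self._lock:
            current, _ = self._data.get(session_id, (0, []))
            if current != expected_version:
                return False
            self._data[session_id] = (current + 1, list(messages))
            return True

def append_message(store, session_id, message):
    # Read-modify-write with retry: concurrent appends can no longer
    # silently overwrite each other
    while True:
        version, messages = store.read(session_id)
        if store.write(session_id, version, messages + [message]):
            return
```

A real system gets the same guarantee from a conditional `UPDATE ... WHERE version = ...` in Postgres, Redis `WATCH`/`MULTI`, or append-only writes such as `RPUSH`.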
## How to Debug It
- Print the exact session key being used
  - Log `session_id`, user ID, tenant ID, and request ID.
  - Confirm they are identical across requests for the same conversation.
- Check whether your storage is process-local
  - If you use `InMemoryChatMessageHistory`, plain dicts, module globals, or class attributes in app code, assume they break under multiple workers.
  - Verify whether your deployment uses Gunicorn workers, Kubernetes replicas, or serverless invocations.
- Inspect the actual LangChain object flow
  - Add logs around `RunnableWithMessageHistory`, `AgentExecutor`, or custom chains.
  - Confirm that the updated messages/state are returned from the runnable and written to persistent storage afterward.
- Reproduce with two parallel requests
  - Send two requests for the same session at nearly the same time.
  - If one update disappears, you have a concurrency problem rather than a LangChain formatting issue.
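If firing real parallel requests is awkward, the lost-update interleaving can be simulated deterministically in a few lines of plain Python, where each `snap` models one request's read:

```python
def demo_lost_update():
    store = {"abc123": []}
    snap1 = list(store["abc123"])        # request 1 reads old state
    snap2 = list(store["abc123"])        # request 2 reads the same old state
    store["abc123"] = snap1 + ["msg-1"]  # request 1 writes
    store["abc123"] = snap2 + ["msg-2"]  # request 2 writes, clobbering msg-1
    return store["abc123"]

print(demo_lost_update())  # ['msg-2']: "msg-1" silently disappeared
```

If your production symptom matches this (one message survives, one vanishes), fix the write pattern before touching any LangChain code.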
## Prevention

- Use durable storage for conversation state:
  - Redis for short-lived chat history
  - Postgres for audit-friendly persistence
  - S3/object storage for long-running artifacts
- Treat LangChain runnables as stateless:
  - pass input in
  - return output out
  - persist side effects explicitly
- Add deployment checks:
  - fail builds if code uses `InMemoryChatMessageHistory` in production paths
  - log session IDs and worker IDs so cross-process bugs are obvious
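The build check can be a small script in CI; this is a sketch with an assumed banned-name list and directory layout, so adjust both to your codebase:

```python
import pathlib

# Names that must not appear in production code paths
BANNED = ("InMemoryChatMessageHistory",)

def find_offenders(root: str) -> list[str]:
    """Return every .py file under `root` that references a banned name."""
    offenders = []
    for path in sorted(pathlib.Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if any(name in text for name in BANNED):
            offenders.append(str(path))
    return offenders

# In CI: run find_offenders("app/") and fail the build if it returns anything.
```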
If you want one rule to keep in mind: if it matters after the current function returns, don’t keep it in Python memory. That’s where most "state not updating in production" issues start in LangChain apps.
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit