How to Fix 'state not updating in production' in LangChain (Python)

By Cyprian Aarons · Updated 2026-04-21

If your LangChain app works locally but state stops updating once it's deployed to production, you're usually dealing with a state-management bug, not a LangChain bug. The common pattern: your chain or agent writes state in memory, but in production the app runs across multiple requests, workers, or processes, so the update never lands where the next step expects it.

This shows up a lot with RunnableWithMessageHistory, AgentExecutor, custom callback handlers, and any setup that assumes a single Python process.

The Most Common Cause

The #1 cause is using in-memory state in a stateless production runtime.

Locally, this often “works” because one process handles the whole request flow. In production, your app may run behind Gunicorn/Uvicorn workers, on serverless functions, or across multiple containers. Each instance has its own memory, so updates disappear between calls.

Broken vs fixed pattern

Broken pattern                               Fixed pattern
Store conversation state in a Python dict    Store state in Redis/Postgres/etc.
Assume one worker handles all requests       Pass a stable session_id and persist history
Mutate local objects inside the chain        Write state through a shared backend
# BROKEN: in-memory history disappears across workers/processes
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

store = {}  # module-level dict: every worker process gets its own copy

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# The inner runnable takes a dict, so it needs a prompt that consumes
# the "history" and "input" keys
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")
chain = RunnableWithMessageHistory(
    prompt | llm,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

# Works locally, breaks when requests hit different workers
result = chain.invoke(
    {"input": "My name is Sam"},
    config={"configurable": {"session_id": "abc123"}},
)

# FIXED: persist history outside process memory
from langchain_community.chat_message_histories import RedisChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

def get_session_history(session_id: str):
    # Every worker reads and writes the same Redis-backed history
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379/0",
        key_prefix="chat:",
    )

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")
chain = RunnableWithMessageHistory(
    prompt | llm,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

result = chain.invoke(
    {"input": "My name is Sam"},
    config={"configurable": {"session_id": "abc123"}},
)

If you’re seeing behavior like:

  • history is empty on the next request
  • tool output appears once and then vanishes
  • agent memory resets after deployment

then this is almost always the root cause.

Other Possible Causes

1) Missing or unstable session_id

If your code uses RunnableWithMessageHistory, the session key must be stable across requests. If you generate a new UUID every time, you are creating a new conversation every time.

# BAD
config={"configurable": {"session_id": str(uuid.uuid4())}}

# GOOD
config={"configurable": {"session_id": user.id}}

If the user ID changes per request, fix that first.
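In a web app, derive the key from the authenticated user. Here's a minimal sketch with FastAPI, assuming the Redis-backed chain from above; get_current_user_id is a hypothetical stand-in for your real auth dependency:

from fastapi import Depends, FastAPI

app = FastAPI()

def get_current_user_id() -> str:
    # Hypothetical stand-in for your real auth dependency
    return "demo-user"

@app.post("/chat")
async def chat(payload: dict, user_id: str = Depends(get_current_user_id)):
    # Same user -> same session_id on every request, on every worker
    result = await chain.ainvoke(
        {"input": payload["message"]},
        config={"configurable": {"session_id": f"user:{user_id}"}},
    )
    return {"reply": result.content}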

2) You’re mutating state inside a tool instead of returning it

Tools should return data. Don’t rely on mutating outer variables and expecting LangChain to preserve them.

# BROKEN
state = {"customer_tier": None}

def set_tier(tier: str):
    state["customer_tier"] = tier
    return f"tier set to {tier}"

# FIXED
def set_tier(tier: str):
    return {"customer_tier": tier}

Then write that result into your durable store explicitly.
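Here's one way that can look with a LangChain @tool. This is a sketch, not a full agent setup; save_customer_tier is a hypothetical helper that writes to your durable store:

from langchain_core.tools import tool

def save_customer_tier(session_id: str, tier: str) -> None:
    # Hypothetical helper: write to Redis/Postgres here
    print(f"persisting tier={tier} for session={session_id}")

@tool
def set_tier(tier: str) -> dict:
    """Record the customer's tier."""
    return {"customer_tier": tier}

# The caller owns persistence: take the tool's return value and store it
result = set_tier.invoke({"tier": "gold"})
save_customer_tier(session_id="abc123", tier=result["customer_tier"])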

3) Callback handlers are local-only

A custom BaseCallbackHandler can log tokens or trace events, but it is not a persistence layer. If you use callbacks to “save” state, that state lives only for the lifetime of the process.

from langchain_core.callbacks import BaseCallbackHandler

class MyHandler(BaseCallbackHandler):
    def on_chain_end(self, outputs, **kwargs):
        print("Saving outputs:", outputs)  # logging only, not persistence

Use callbacks for observability, not storage.

4) Async race conditions overwrite newer state

In production, two requests can hit the same session at once. If both read old state and write back later, the last write wins and one update disappears.

# Example symptom: concurrent writes clobber each other
await save_state(session_id, old_state + new_message)

Fix this with:

  • optimistic locking
  • database transactions
  • Redis locks if needed
  • append-only message storage instead of full-object overwrite (sketched below)
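
As an illustration of the append-only option, a minimal sketch using the redis-py client (assumed to be installed). Because RPUSH appends atomically, two concurrent requests both land instead of overwriting each other:

import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def append_message(session_id: str, message: dict) -> None:
    # RPUSH is atomic: concurrent writers append, they never clobber
    r.rpush(f"chat:{session_id}", json.dumps(message))

def load_messages(session_id: str) -> list[dict]:
    return [json.loads(m) for m in r.lrange(f"chat:{session_id}", 0, -1)]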

How to Debug It

  1. Print the exact session key being used

    • Log session_id, user ID, tenant ID, and request ID.
    • Confirm they are identical across requests for the same conversation.
  2. Check whether your storage is process-local

    • If you use InMemoryChatMessageHistory, plain dicts, module globals, or class attributes in app code, assume they break under multiple workers.
    • Verify whether deployment uses Gunicorn workers, Kubernetes replicas, or serverless invocations.
  3. Inspect the actual LangChain object flow

    • Add logs around RunnableWithMessageHistory, AgentExecutor, or custom chains.
    • Confirm that the updated messages/state are being returned from the runnable and written to persistent storage afterward.
  4. Reproduce with two parallel requests

    • Send two requests for the same session at nearly the same time.
    • If one update disappears, you have a concurrency problem rather than a LangChain formatting issue (a reproduction sketch follows below).
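
A minimal reproduction sketch using asyncio and httpx, assuming a local /chat endpoint; adjust the URL and payload shape to your app:

import asyncio
import httpx

async def main() -> None:
    url = "http://localhost:8000/chat"  # hypothetical endpoint
    async with httpx.AsyncClient() as client:
        # Two requests for the same session, fired concurrently
        await asyncio.gather(
            client.post(url, json={"session_id": "abc123", "message": "first"}),
            client.post(url, json={"session_id": "abc123", "message": "second"}),
        )

asyncio.run(main())

Afterward, read the stored history for abc123. If only one message survived, the writes are racing.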

Prevention

  • Use durable storage for conversation state:

    • Redis for short-lived chat history
    • Postgres for audit-friendly persistence
    • S3/object storage for long-running artifacts
  • Treat LangChain runnables as stateless:

    • pass input in
    • return output out
    • persist side effects explicitly
  • Add deployment checks:

    • fail builds if code uses InMemoryChatMessageHistory in production paths (a simple check is sketched below)
    • log session IDs and worker IDs so cross-process bugs are obvious
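
The build check can be a short script your CI runs before deploy. A minimal sketch, assuming your production code lives under app/:

import pathlib
import sys

BANNED = "InMemoryChatMessageHistory"

# Fail the build if any production source file references in-memory history
hits = [
    str(path)
    for path in pathlib.Path("app").rglob("*.py")
    if BANNED in path.read_text(encoding="utf-8")
]

if hits:
    print(f"{BANNED} found in: " + ", ".join(hits))
    sys.exit(1)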

If you want one rule to keep in mind: if it matters after the current function returns, don't keep it in Python memory. That's where most "state not updating in production" issues start in LangChain apps.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
