How to Fix 'memory not persisting in production' in LangGraph (Python)

By Cyprian Aarons · Updated 2026-04-21
Tags: memory-not-persisting-in-production, langgraph, python

What this error usually means

If your LangGraph agent works locally but “forgets” state in production, the issue is almost always with checkpointing or thread identity. In practice, you’ll see symptoms like:

  • a new conversation starts on every request
  • MessagesState is empty after the first turn
  • the graph runs, but prior messages never reappear

The most common culprit is a MemorySaver checkpointer running somewhere it cannot persist across process restarts, or graph invocations that don't pass a stable thread_id.

The Most Common Cause

The #1 cause is this pattern: you created a checkpointer, but your production deployment does not keep process memory alive.

MemorySaver is an in-memory checkpointer. It works for local testing, notebooks, and single-process dev servers. It does not survive container restarts, autoscaling, multiple workers, or serverless cold starts.

Broken vs fixed

| Broken pattern | Right pattern |
| --- | --- |
| Uses `MemorySaver()` in production | Uses a persistent checkpointer |
| No stable `thread_id` | Passes a consistent `thread_id` per user/session |
| State disappears after restart | State survives restarts |
```python
# BROKEN: memory only lives inside this Python process
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, MessagesState, START, END

builder = StateGraph(MessagesState)
builder.add_node("chat", chat_node)
builder.add_edge(START, "chat")
builder.add_edge("chat", END)

graph = builder.compile(checkpointer=MemorySaver())

# This may work locally, then fail in production after restart / scale-out
result = graph.invoke(
    {"messages": [{"role": "user", "content": "Hello"}]},
    config={"configurable": {"thread_id": "user-123"}}
)
```
```python
# FIXED: use a persistent checkpointer (example: Postgres)
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import StateGraph, MessagesState, START, END

builder = StateGraph(MessagesState)
builder.add_node("chat", chat_node)
builder.add_edge(START, "chat")
builder.add_edge("chat", END)

with PostgresSaver.from_conn_string(DB_URL) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run
    graph = builder.compile(checkpointer=checkpointer)

    result = graph.invoke(
        {"messages": [{"role": "user", "content": "Hello"}]},
        config={"configurable": {"thread_id": "user-123"}}
    )
```

If you’re deploying to Kubernetes, ECS, Cloud Run, or serverless Python functions, this is usually the whole problem. In-memory state is not persistence.

Other Possible Causes

1) You forgot to pass thread_id

LangGraph uses thread_id to look up prior state. If every request gets a new ID, you’ve effectively told the graph to start fresh every time.

```python
# Broken: no thread_id, so every request starts a fresh conversation
graph.invoke(input_data)

# Fixed: a stable thread_id reconnects each call to its saved state
graph.invoke(
    input_data,
    config={"configurable": {"thread_id": "acct-48291"}}
)
```

Use a real conversation key:

  • user ID
  • account ID
  • case ID
  • session ID from your app layer
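
One robust pattern is to derive the thread_id deterministically from keys you already have, so every request for the same conversation maps to the same thread. This is a minimal sketch; `thread_id_for` and its `user_id`/`conversation_id` parameters are illustrative names, not LangGraph APIs:

```python
import uuid

def thread_id_for(user_id: str, conversation_id: str) -> str:
    """Derive a deterministic thread_id from stable application keys.

    uuid5 is a pure function of its inputs, so the same user/conversation
    pair always yields the same thread_id across requests and restarts.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{user_id}/{conversation_id}"))

tid = thread_id_for("acct-48291", "case-7")
# Pass this as config={"configurable": {"thread_id": tid}} on every invoke
```

Because the ID is derived rather than generated, clients cannot accidentally start a new thread by minting a fresh UUID per request.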

2) Your deployment has multiple workers and each one has its own memory

This shows up with Gunicorn/Uvicorn workers or multiple pods. One request hits worker A, the next hits worker B. Each worker has its own MemorySaver, so state appears random.

```shell
# Bad for in-memory checkpointing: four workers, four separate memories
gunicorn app:app --workers 4
```

If you must run multiple workers:

  • use a persistent checkpointer like Postgres or Redis-backed storage
  • do not rely on process-local memory for conversation state
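
You can reproduce this failure mode without any web server. The toy simulation below uses plain dicts as stand-ins for per-process MemorySaver instances; `Worker` is a made-up class for illustration, not a LangGraph API:

```python
class Worker:
    """Stand-in for one web worker process with its own in-memory checkpointer."""

    def __init__(self):
        self._store = {}  # plays the role of a per-process MemorySaver

    def handle(self, thread_id: str, message: str) -> list:
        history = self._store.setdefault(thread_id, [])
        history.append(message)
        return list(history)

worker_a, worker_b = Worker(), Worker()

first = worker_a.handle("user-123", "Hello")   # request 1 routed to worker A
second = worker_b.handle("user-123", "Again")  # request 2 routed to worker B

print(first)   # ['Hello']
print(second)  # ['Again'] -- worker B never saw 'Hello', so history seems to reset
```

Whether a request "remembers" depends entirely on which worker the load balancer picks, which is why the symptom looks random rather than consistent.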

3) You are recreating the graph on every request with fresh state assumptions

Rebuilding the graph per request is fine. Rebuilding it with an ephemeral checkpointer is not.

```python
# Broken if paired with MemorySaver and multi-worker deployment
def get_graph():
    builder = StateGraph(MessagesState)
    builder.add_node("chat", chat_node)
    return builder.compile(checkpointer=MemorySaver())
```

Better:

```python
# Keep durable storage outside the request path.
# PostgresSaver.from_conn_string returns a context manager, so a long-lived
# app can hand the saver a psycopg connection pool instead.
from psycopg_pool import ConnectionPool
from langgraph.checkpoint.postgres import PostgresSaver

pool = ConnectionPool(DB_URL, kwargs={"autocommit": True})
checkpointer = PostgresSaver(pool)
checkpointer.setup()

def get_graph():
    builder = StateGraph(MessagesState)
    builder.add_node("chat", chat_node)
    return builder.compile(checkpointer=checkpointer)
```

4) You are mixing up state and message history

A common mistake is expecting arbitrary Python variables to persist between turns. LangGraph persists checkpointed state, not local variables inside your node function.

```python
# Broken: local variable resets on every run
def chat_node(state):
    seen = []
    seen.append(state["messages"][-1].content)
    return {"messages": []}
```

Store what matters in graph state:

```python
# Fixed: use the graph state itself
def chat_node(state):
    last_msg = state["messages"][-1].content
    return {"messages": [{"role": "assistant", "content": f"You said: {last_msg}"}]}
```

How to Debug It

  1. Check whether you are using MemorySaver

    • If yes and this is production, assume that’s the bug.
    • Search for:
      • from langgraph.checkpoint.memory import MemorySaver
      • checkpointer=MemorySaver()
  2. Verify that every call includes the same thread_id

    • Log it at the API boundary.
    • If it changes between requests, LangGraph will load a different thread checkpoint.
  3. Inspect whether your app runs with more than one process

    • Look at:
      • Gunicorn workers
      • Uvicorn workers
      • Kubernetes replicas
      • serverless invocations
    • If yes and you’re using in-memory storage, persistence will fail by design.
  4. Read back checkpoints directly

    • With persistent storage, confirm data exists after each turn.
    • If nothing is stored, your issue is wiring.
    • If data exists but isn’t loaded, your issue is usually thread_id mismatch.

A useful symptom map:

| Symptom | Likely cause |
| --- | --- |
| Works locally only | `MemorySaver` in prod |
| First message persists, second doesn't | Missing or changing `thread_id` |
| Random behavior across requests | Multiple workers/pods with local memory |
| Checkpoints exist but aren't used | Wrong config path or thread identity |

Prevention

  • Use a persistent checkpointer from day one in any deployed agent:

    • Postgres for standard web apps
    • Redis if you already operate it as shared infra and understand its durability tradeoffs
  • Treat thread_id as part of your API contract.

    • Generate it once per conversation/session.
    • Never let clients invent new IDs on every request.
  • Add an integration test that runs two consecutive invocations against the same thread.

    • First turn writes state.
    • Second turn must read it back.
    • Run that test against the same storage backend you use in production.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
