How to Fix 'memory not persisting when scaling' in LangGraph (Python)

By Cyprian Aarons · Updated 2026-04-21

When LangGraph memory stops persisting after you scale from one process to multiple workers, the problem is usually not the graph logic itself. It’s almost always a checkpointing or thread identity issue: each worker is writing to its own local state, or your app is starting a fresh thread on every request.

You’ll typically see behavior like this:

  • First message in a conversation works
  • Second request comes back “blank” or forgets prior state
  • It works on one pod, then fails once you add more replicas

The Most Common Cause

The #1 cause is using an in-memory checkpointer, or creating a new checkpointer per process. MemorySaver and other local-memory patterns work in a single Python process, but they do not persist across workers, pods, or restarts.

Here’s the broken pattern versus the fixed pattern.

Broken → Fixed

  • MemorySaver() inside the app process → a shared persistent checkpointer (Postgres/Redis)
  • A new graph/checkpointer per worker → one durable store used by all workers
  • No stable thread_id → the same thread_id for the same conversation

# BROKEN: state lives only in this Python process
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

checkpointer = MemorySaver()

# "builder" is your StateGraph, assembled elsewhere in the app
graph = builder.compile(checkpointer=checkpointer)

# Every worker has its own isolated memory.
result = graph.invoke(
    {"messages": [("user", "hello")]},
    config={"configurable": {"thread_id": "abc123"}}
)

# FIXED: use a persistent checkpointer shared across workers
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:pass@postgres:5432/langgraph"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run

    graph = builder.compile(checkpointer=checkpointer)

    result = graph.invoke(
        {"messages": [("user", "hello")]},
        config={"configurable": {"thread_id": "abc123"}}
    )

If you’re deploying behind Gunicorn, Uvicorn workers, Kubernetes, ECS, or Cloud Run, this is the first thing to fix. A MemorySaver checkpoint stored in worker A will never be visible to worker B.
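If that describes your deployment, one workable pattern is to open the durable checkpointer once per worker at startup and reuse it for every request. Below is a minimal FastAPI/Uvicorn sketch; it assumes builder is your StateGraph and that DATABASE_URL is injected by your platform, so treat it as one way to wire things rather than the canonical setup.

import os
from contextlib import asynccontextmanager

from fastapi import FastAPI
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = os.environ["DATABASE_URL"]  # the same shared Postgres for every worker

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Open one durable checkpointer per worker process, once, at startup.
    # "builder" is your StateGraph, defined at import time.
    with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
        checkpointer.setup()  # creates the checkpoint tables on first run
        app.state.graph = builder.compile(checkpointer=checkpointer)
        yield  # the connection stays open for the worker's lifetime

app = FastAPI(lifespan=lifespan)

@app.post("/chat")
def chat(conversation_id: str, message: str):
    # Reuse the compiled graph and keep thread_id stable per conversation.
    return app.state.graph.invoke(
        {"messages": [("user", message)]},
        config={"configurable": {"thread_id": conversation_id}},
    )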

Other Possible Causes

1) You are not passing a stable thread_id

LangGraph uses thread_id to load the correct checkpoint. If you generate a new ID every request, persistence will look broken even with a real database.

# BROKEN: new thread every request
import uuid

graph.invoke(
    input_data,
    config={"configurable": {"thread_id": str(uuid.uuid4())}}
)

# FIXED: reuse the same conversation/thread identifier
graph.invoke(
    input_data,
    config={"configurable": {"thread_id": conversation_id}}
)

2) You compiled the graph inside the request handler

Compiling per request can hide state bugs and create inconsistent runtime behavior. The graph should usually be built once at startup.

# BROKEN: the graph is recompiled on every request
@app.post("/chat")
def chat(req: ChatRequest):
    graph = builder.compile(checkpointer=checkpointer)
    return graph.invoke(req.input, config=req.config)

# FIXED: compile once at startup and reuse across requests
graph = builder.compile(checkpointer=checkpointer)

@app.post("/chat")
def chat(req: ChatRequest):
    return graph.invoke(req.input, config=req.config)

3) Your database is not actually shared by all replicas

This shows up when each container points to localhost or an ephemeral volume. One pod writes checkpoints; another pod reads from a different place.

# BROKEN Kubernetes env example: each pod talks to its own localhost
env:
  - name: DATABASE_URL
    value: postgresql://user:pass@localhost:5432/langgraph

# FIXED Kubernetes env example: every replica reads the same shared secret
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: langgraph-db-secret
        key: DATABASE_URL

If your URL says localhost, assume it is wrong unless Postgres is running in the same container.
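A cheap guard against this is to refuse to start when a replica is pointed at a local database. This is plain Python rather than a LangGraph feature, and the DATABASE_URL name is just the convention used above:

import os

db_url = os.environ.get("DATABASE_URL", "")

# If the URL points at localhost, this replica would write checkpoints
# to a database no other replica can see, so fail fast at startup.
if "localhost" in db_url or "127.0.0.1" in db_url:
    raise RuntimeError(f"DATABASE_URL points at a local database: {db_url}")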

4) You are mixing sync and async incorrectly

If you use async graphs with sync invocations, or forget to await async persistence calls in surrounding code, checkpoints may never flush when expected.

# BROKEN: the coroutine is never awaited, so the run (and its checkpoint) never happens
result = graph.ainvoke(input_data, config=config)

# FIXED: keep invocation style consistent end-to-end
result = await graph.ainvoke(input_data, config=config)
# and ensure your route/function is async too

Also make sure your checkpointer actually supports the async path you’re using, and don’t mix sync and async checkpointer APIs casually.
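
If your stack is fully async, the usual pairing is the async Postgres checkpointer with ainvoke end-to-end. A minimal sketch, assuming builder and DB_URI are defined as in the earlier examples (in a real app you would open the checkpointer once at startup rather than per call):

from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

async def run_turn(input_data: dict, thread_id: str):
    # Async checkpointer + ainvoke: no sync/async mixing anywhere on the path.
    async with AsyncPostgresSaver.from_conn_string(DB_URI) as checkpointer:
        await checkpointer.setup()  # creates the checkpoint tables on first run
        graph = builder.compile(checkpointer=checkpointer)
        return await graph.ainvoke(
            input_data,
            config={"configurable": {"thread_id": thread_id}},
        )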

How to Debug It

  1. Confirm which checkpointer you compiled with

    • Log the class name at startup.
    • If you see MemorySaver, that’s your answer.
    • Example:
      print(type(checkpointer).__name__)
      
  2. Verify the same thread_id is reused

    • Log it on every request.
    • Send two requests with the exact same ID and compare results.
    • If state resets between requests, your client is generating new IDs.

  3. Check whether all workers point at the same store

    • Inspect environment variables on every replica.
    • Confirm DB hostnames are identical and reachable.
    • In Kubernetes, exec into two pods and compare DATABASE_URL.

  4. Inspect checkpoints directly

    • Query your backing store for saved threads/checkpoints, or inspect them via the graph as in the sketch after this list.
    • If nothing is being written, your persistence layer isn’t configured correctly.
    • If writes exist but reads don’t match, your thread mapping is wrong.
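
For step 4, you can inspect what LangGraph actually persisted straight from the compiled graph, without writing SQL. A quick sketch using the compiled graph’s state-inspection methods:

config = {"configurable": {"thread_id": "abc123"}}

# Latest persisted state for this thread; empty values mean nothing was written.
snapshot = graph.get_state(config)
print(snapshot.values)

# Every saved checkpoint for the thread, newest first.
for snap in graph.get_state_history(config):
    print(snap.config["configurable"].get("checkpoint_id"))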

Prevention

  • Use a durable checkpointer in any multi-worker deployment:

    • PostgresSaver
    • Redis-based persistence if that fits your architecture better
  • Treat thread_id as part of your API contract (see the sketch after this list):

    • stable per user session/conversation
    • never random per request
  • Initialize LangGraph once at process startup:

    • build nodes once
    • compile once
    • reuse across requests
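
One way to make that contract explicit is to require a conversation identifier in the request model and derive thread_id from it on the server. The ChatRequest and conversation_id names below are illustrative, not anything LangGraph requires:

from pydantic import BaseModel

class ChatRequest(BaseModel):
    conversation_id: str  # stable per conversation; owned and reused by the client
    message: str

def config_for(req: ChatRequest) -> dict:
    # thread_id comes from the request, never from uuid4() per call
    return {"configurable": {"thread_id": req.conversation_id}}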

If you want one sentence to remember this by: LangGraph memory only persists at scale when both the checkpointer and the thread_id are stable across workers.

