How to Fix 'memory not persisting in production' in LangGraph (TypeScript)

By Cyprian Aarons · Updated 2026-04-21

What the error means

If your LangGraph app works locally but memory disappears in production, you’re usually not actually persisting state between requests. In TypeScript, this shows up when you use an in-memory checkpointer, forget to pass a stable thread_id, or deploy to a serverless or autoscaled runtime that resets process memory between invocations.

The common symptom: the first turn works, but the second turn behaves like a fresh conversation. In logs, you may also see LangGraph state being rebuilt without any prior checkpoint data.

The Most Common Cause

The #1 cause is using an ephemeral checkpointer in production, usually MemorySaver, or not wiring a real persistence layer at all.

MemorySaver is fine for local tests. It is not durable across restarts, multiple instances, or serverless invocations.

Broken vs fixed pattern

  • Broken: uses MemorySaver in production. Fixed: use a durable checkpointer, such as PostgreSQL- or Redis-backed storage.
  • Broken: recreates the graph per request with no shared persistence. Fixed: reuse the same compiled graph and a stable checkpointer.
  • Broken: no stable thread_id in configurable. Fixed: pass a consistent thread_id per user/session.
// BROKEN: memory disappears on restart or across instances
import { StateGraph, MemorySaver } from "@langchain/langgraph";

const graph = new StateGraph(MyState)
  .addNode("agent", agentNode)
  .addEdge("__start__", "agent")
  .addEdge("agent", "__end__")
  .compile({
    checkpointer: new MemorySaver(),
  });

await graph.invoke(
  { messages: [{ role: "user", content: "Remember this" }] },
  // Missing configurable.thread_id makes persistence unusable across turns
);
// FIXED: durable persistence + stable thread_id
import { StateGraph } from "@langchain/langgraph";
// Example checkpointer package depends on your storage choice
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";

const checkpointer = PostgresSaver.fromConnString(process.env.POSTGRES_URL!);
// Run setup once so the checkpoint tables exist before the first write
await checkpointer.setup();

const graph = new StateGraph(MyState)
  .addNode("agent", agentNode)
  .addEdge("__start__", "agent")
  .addEdge("agent", "__end__")
  .compile({
    checkpointer,
  });

await graph.invoke(
  { messages: [{ role: "user", content: "Remember this" }] },
  {
    configurable: {
      thread_id: "user_123_conversation_456",
    },
  }
);

If you only fix one thing, fix this. Most “memory not persisting” bugs are just “no durable checkpoint store” bugs.

Other Possible Causes

1) You are not passing thread_id consistently

LangGraph uses the thread identifier to load the right checkpoint. If every request gets a new ID, you’ll always get a fresh state.

// BAD
await graph.invoke(input, {
  configurable: { thread_id: crypto.randomUUID() },
});

// GOOD
await graph.invoke(input, {
  configurable: { thread_id: session.userId }, // or conversationId
});

If you’re behind an API gateway or load balancer, make sure the ID comes from your app domain, not process-local data.
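One way to keep the ID deterministic is to derive it from identifiers your app already owns. A minimal sketch (the naming scheme here is illustrative; use whatever uniquely identifies a conversation in your domain):

```typescript
// Derive a stable thread_id from app-domain identifiers.
// The same user + conversation must always map to the same thread_id,
// no matter which instance or invocation handles the request.
function threadIdFor(userId: string, conversationId: string): string {
  return `user_${userId}_conversation_${conversationId}`;
}

// Usage: pass the derived ID into every invoke for that conversation
// await graph.invoke(input, {
//   configurable: { thread_id: threadIdFor(session.userId, conversationId) },
// });
```

Because the function is pure, two servers behind a load balancer will compute the same thread_id for the same conversation.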

2) You compiled the graph without a checkpointer

This often happens when the app starts life as a stateless agent and memory expectations are added later.

// BAD
const app = graph.compile(); // no checkpointer

// GOOD
const app = graph.compile({
  checkpointer,
});

Without a checkpointer, LangGraph can execute state transitions, but it has nowhere to save them for later runs.

3) Your deployment resets process memory

This is common on Vercel functions, AWS Lambda, Cloud Run autoscaling, or multiple Node replicas. Anything stored in module scope will vanish or diverge across instances.

// BAD: module-level memory is not durable in serverless/multi-instance setups
const sessions = new Map<string, unknown>();

export async function handler(req: Request) {
  sessions.set("last_state", req.body);
}

Use external storage for checkpoints:

// GOOD: persisted outside the process
const checkpointer = PostgresSaver.fromConnString(process.env.POSTGRES_URL!);
const app = graph.compile({ checkpointer });
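Even with external storage, you generally want to initialize the checkpointer and compile the graph once per process, not once per request. A minimal memoization sketch (initApp and the App type are stand-ins for your real setup, e.g. creating the checkpointer and compiling the graph):

```typescript
// Stand-in for your compiled graph + checkpointer bundle
type App = { createdAt: number };

let appPromise: Promise<App> | undefined;

async function initApp(): Promise<App> {
  // expensive setup (DB connections, graph compilation) happens here
  return { createdAt: Date.now() };
}

function getApp(): Promise<App> {
  // memoize the promise so concurrent requests share one initialization
  appPromise ??= initApp();
  return appPromise;
}
```

Request handlers then call getApp() instead of rebuilding everything, which keeps connections pooled and avoids divergent per-request state.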

4) Your state schema does not include the fields you expect to persist

LangGraph only persists what’s part of the state. If your reducer/state type omits messages or custom fields, they won’t survive as expected.

type State = {
  // BAD if you expect chat history but never define it properly
};

type GoodState = {
  messages: Array<{ role: string; content: string }>;
};

Also make sure your node returns updates in the shape your reducers expect:

return {
  messages: [...state.messages, newMessage],
};

If you return the wrong key name or mutate state incorrectly, it can look like persistence failed when the real issue is bad state updates.
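The update-vs-replace distinction can be sketched with plain functions (types here are illustrative, not LangGraph’s actual channel API): a node returns a partial update keyed by the state field, and an append-style reducer merges it into the existing channel value.

```typescript
type Message = { role: string; content: string };
type State = { messages: Message[] };

// Append-style reducer: merges an update into the current channel value
// instead of replacing it
function messagesReducer(current: Message[], update: Message[]): Message[] {
  return [...current, ...update];
}

// A node returns only the update, keyed by the state field name;
// the reducer is responsible for combining it with prior history
function agentNode(_state: State): Partial<State> {
  return { messages: [{ role: "assistant", content: "noted" }] };
}
```

If the node returned { message: ... } (wrong key) or the full mutated array to an append reducer (duplicated history), the symptoms would look like broken persistence.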

How to Debug It

  1. Check whether your app uses a durable checkpointer

    • Search for MemorySaver.
    • If that’s what you’re using in prod, that’s the bug.
    • Confirm your compiled app has checkpointer.
  2. Log the exact config passed into invoke

    • Verify configurable.thread_id exists and stays stable.
    • Log it once per request:
      console.log("thread_id:", config.configurable?.thread_id);
      
  3. Inspect whether checkpoints are actually written

    • Query your checkpoint store directly.
    • If using Postgres/Redis and nothing is being stored, the issue is upstream of retrieval.
    • If writes exist but reads don’t match, your thread_id mapping is wrong.
  4. Confirm deployment behavior

    • Check whether requests hit multiple instances.
    • If local works and prod fails under load balancer traffic, assume process memory is unreliable until proven otherwise.
    • Test by restarting one instance and seeing if history survives.

Prevention

  • Use a real external checkpoint store in any environment where the process can restart or scale horizontally.
  • Treat thread_id like a primary key:
    • stable per user/session/conversation
    • never random per request
  • Add an integration test that:
    • invokes once,
    • restarts the app or creates a new request context,
    • invokes again,
    • asserts prior state is still present.
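The shape of that integration test can be sketched with a stand-in store. Here the Map plays the role of your external checkpoint database, and each makeHandler() call simulates a fresh process after a restart; a real test would point two app instances at actual Postgres/Redis.

```typescript
// Stand-in for the external checkpoint store (survives "restarts" below)
const durableStore = new Map<string, string[]>();

// Each call simulates a freshly started instance/handler
function makeHandler() {
  return (threadId: string, message: string): string[] => {
    const history = durableStore.get(threadId) ?? [];
    const updated = [...history, message];
    durableStore.set(threadId, updated);
    return updated;
  };
}
```

The assertion that matters: after the simulated restart, the second handler still sees the first turn under the same thread_id.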

If you want one rule to remember: LangGraph memory persists through checkpoints, not vibes. When production loses memory, start by checking the store and the thread ID before digging into model logic.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
