How to Fix 'memory not persisting in production' in LangChain (TypeScript)

By Cyprian Aarons · Updated 2026-04-21
Tags: memory-not-persisting-in-production, langchain, typescript

If your LangChain memory works locally but resets in production, you are usually not dealing with a LangChain bug. You are dealing with process lifecycle, missing persistence, or the wrong memory abstraction for a stateless deployment.

This shows up a lot in TypeScript apps deployed to serverless, containers, or multi-instance Node backends. The symptom is simple: the chain answers as if every request is the first one.

The Most Common Cause

The #1 cause is using process-local memory classes like BufferMemory (or ConversationBufferMemory, its Python counterpart) and expecting them to survive across requests or cold starts.

These classes only persist inside the running Node process. In production, that process may restart, scale horizontally, or disappear after each request.

Broken vs fixed pattern

  • Broken: BufferMemory stored in module scope and reused across requests → Fixed: persist conversation state in Redis/Postgres/etc. and hydrate memory per request
  • Broken: assumes one Node process handles all traffic → Fixed: treat the app as stateless
// ❌ Broken: memory lives only in this process
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const memory = new BufferMemory(); // resets on restart / new instance

const chain = new ConversationChain({
  llm,
  memory,
});

export async function chat(message: string) {
  const res = await chain.invoke({ input: message });
  return res.response;
}
// ✅ Fixed: persist history outside the process
import { ChatOpenAI } from "@langchain/openai";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { InMemoryChatMessageHistory } from "@langchain/core/chat_history"; // replace with Redis/Postgres in prod
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// The placeholder name must match historyMessagesKey below.
const prompt = ChatPromptTemplate.fromMessages([
  new MessagesPlaceholder("history"),
  ["human", "{input}"],
]);

// Example store; swap for Redis-backed storage in production
const histories = new Map<string, InMemoryChatMessageHistory>();

function getSessionHistory(sessionId: string) {
  if (!histories.has(sessionId)) {
    histories.set(sessionId, new InMemoryChatMessageHistory());
  }
  return histories.get(sessionId)!;
}

const chain = new RunnableWithMessageHistory({
  runnable: prompt.pipe(llm),
  getMessageHistory: (sessionId) => getSessionHistory(sessionId),
  inputMessagesKey: "input",
  historyMessagesKey: "history",
});

export async function chat(sessionId: string, message: string) {
  const res = await chain.invoke(
    { input: message },
    { configurable: { sessionId } }
  );
  return res.content;
}

If you are on serverless or Kubernetes, use Redis, DynamoDB, Postgres, or another external store. A Map is only useful for local debugging.
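
For example, swapping the in-memory Map for Redis can be as small as replacing getSessionHistory. Here is a sketch using the @langchain/redis package; the constructor options shown are an assumption, so verify them against the version you have installed:

import { RedisChatMessageHistory } from "@langchain/redis";

// Sketch: each session id maps to its own Redis key, so any pod or
// function instance can rehydrate the same conversation.
function getSessionHistory(sessionId: string) {
  return new RedisChatMessageHistory({
    sessionId,
    sessionTTL: 60 * 60 * 24, // optional: expire idle conversations after a day
    config: { url: process.env.REDIS_URL },
  });
}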

Other Possible Causes

1. You are creating a new chain on every request

If the chain and memory are instantiated inside the handler, the history is recreated, empty, on every call.

// ❌ New memory every request
export async function handler(req: Request) {
  const { message } = await req.json(); // fetch-style Request body
  const memory = new BufferMemory(); // brand-new, empty history every call
  const chain = new ConversationChain({ llm, memory });
  return chain.invoke({ input: message });
}
// ✅ Reuse a persistent store keyed by session/user id
const historyStore = new Map<string, string[]>();

export async function handler(req: Request) {
  const sessionId = req.headers.get("x-session-id")!;
  const history = historyStore.get(sessionId) ?? [];
  // load history into your memory implementation here
}

2. Your deployment is stateless and scaled horizontally

Two requests from the same user may hit different pods or Lambda instances. In that case BufferMemory behaves unpredictably in production: whether a follow-up message "remembers" anything depends on which instance it lands on.

# Example symptom with a scaled deployment (replicas: 3):
# pod-a holds the conversation state
# pod-b answers with an empty memory object

Fix:

  • Store chat history externally.
  • Use sticky sessions only as a temporary workaround.
  • Do not rely on process-local state.

3. You are not passing the same session identifier

A lot of “memory not persisting” bugs are really “wrong key” bugs.

// ❌ Session id changes every request
await chain.invoke(
  { input: message },
  { configurable: { sessionId: crypto.randomUUID() } }
);
// ✅ Stable session id from auth/user context/cookie
await chain.invoke(
  { input: message },
  { configurable: { sessionId: user.id } }
);

If you generate a fresh ID per call, you have created a brand-new conversation every time.
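
Where does a stable id come from? Usually your auth layer or a cookie. A minimal sketch, assuming a fetch-style Request and a hypothetical sid cookie:

function getSessionId(req: Request): string {
  // Prefer an authenticated user id when you have one; otherwise
  // fall back to a session cookie.
  const cookie = req.headers.get("cookie") ?? "";
  const match = cookie.match(/(?:^|;\s*)sid=([^;]+)/);
  // If no cookie exists yet, mint an id here and remember to set it
  // on the response, or the id will change again on the next request.
  return match?.[1] ?? crypto.randomUUID();
}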

4. You are using the wrong API for your LangChain version

LangChain TypeScript has changed a lot across versions. Code written for older ConversationChain patterns often breaks when moved to newer runnable-based APIs.

Common signs:

  • BufferMemory exists in your code but its contents never get read back.
  • loadMemoryVariables() runs locally but not in your request path.
  • You see messages like:
    • Error: Missing value for input key
    • TypeError: Cannot read properties of undefined
    • Expected MessagesPlaceholder "history" to be provided

Check whether you should be using:

  • RunnableWithMessageHistory
  • MessagesPlaceholder
  • a proper message history store
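
The last error in that list usually means the placeholder name and the history key disagree. A minimal sketch of the pair that must match (the prompt itself is illustrative):

import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";

// "history" must match historyMessagesKey in RunnableWithMessageHistory,
// and "{input}" must match inputMessagesKey.
const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant."],
  new MessagesPlaceholder("history"),
  ["human", "{input}"],
]);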

How to Debug It

  1. Print the session ID on every request

    • If it changes between messages, that is your bug.
    • Log it alongside the user ID and the deployment instance name (see the sketch after this list).
  2. Log what memory contains before and after invocation

    console.log("before", await memory.loadMemoryVariables({}));
    const result = await chain.invoke({ input: message });
    console.log("after", await memory.loadMemoryVariables({}));
    

    If it resets immediately after one call, you are recreating it or losing process state.

  3. Check where the code runs

    • Local dev with one Node process can hide production issues.
    • Verify whether you are on Lambda, Vercel functions, multiple pods, or PM2 cluster mode.
  4. Trace the storage layer

    • If using Redis/Postgres/DynamoDB, inspect the actual writes (see the sketch after this list).
    • Confirm writes happen under the same key you later read from.
    • Look for cache TTLs expiring too early.
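
For steps 1 and 4, a minimal debugging sketch; it assumes an ioredis client and a chat:<sessionId> key layout, both of which are illustrative:

import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

export async function debugSession(sessionId: string) {
  // Step 1: log the session id next to the instance identity.
  // HOSTNAME is the pod name on Kubernetes; adjust per platform.
  console.log({ sessionId, instance: process.env.HOSTNAME ?? "local" });

  // Step 4: read back the exact key you believe you are writing to.
  const key = `chat:${sessionId}`;
  const messages = await redis.lrange(key, 0, -1);
  console.log(`${key}: ${messages.length} stored messages`);

  // A short TTL can masquerade as "memory randomly resetting".
  console.log("ttl (s):", await redis.ttl(key));
}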

Prevention

  • Use external persistence for any conversation state that must survive beyond one process.
  • Key all chat history by stable identifiers like userId, conversationId, or both.
  • Prefer RunnableWithMessageHistory over ad hoc global variables when building TypeScript chat systems.
  • Add an integration test that sends the same session's requests to two separate instances and verifies the history still loads (a sketch follows).
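
A minimal version of that test, using vitest; the instance URLs, the /chat endpoint, and the response shape are all hypothetical, so adapt them to your deployment:

import { test, expect } from "vitest";

const headers = (sessionId: string) => ({
  "content-type": "application/json",
  "x-session-id": sessionId,
});

test("conversation history survives across instances", async () => {
  const sessionId = `it-${Date.now()}`;

  // First message lands on instance A.
  await fetch("https://instance-a.internal/chat", {
    method: "POST",
    headers: headers(sessionId),
    body: JSON.stringify({ message: "My name is Ada." }),
  });

  // Follow-up lands on instance B; it should still know the name.
  const res = await fetch("https://instance-b.internal/chat", {
    method: "POST",
    headers: headers(sessionId),
    body: JSON.stringify({ message: "What is my name?" }),
  });

  const { reply } = await res.json();
  expect(reply.toLowerCase()).toContain("ada");
});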

If your production app depends on conversation continuity, treat memory like data storage, not like an object in RAM. That is the real fix behind most LangChain “memory not persisting” reports.



By Cyprian Aarons, AI Consultant at Topiax.
