How to Fix 'memory not persisting when scaling' in LangChain (TypeScript)

By Cyprian Aarons · Updated 2026-04-21

When memory stops persisting after you scale a LangChain app, it usually means your chat history is living in process memory instead of shared storage. It works on one local instance, then breaks the moment you run multiple pods or serverless invocations, or a load balancer sends the next request to a different node.

The symptom is usually this: the chain runs, but chat_history comes back empty on the next request. In some setups the logs also show BufferMemory apparently “forgetting” previous turns, even though the code looked fine in development.

The Most Common Cause

The #1 cause is using an in-memory store such as BufferMemory or ChatMessageHistory without a persistent backend. Those classes hold state only inside one Node.js process.

Here’s the broken pattern:

// broken.ts
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";
import { ChatOpenAI } from "@langchain/openai";

// Module scope: this memory lives in one process's RAM and is shared
// by every user that process serves.
const memory = new BufferMemory({
  returnMessages: true,
  memoryKey: "chat_history",
});

const chain = new ConversationChain({
  llm: new ChatOpenAI({ modelName: "gpt-4o-mini" }),
  memory,
});

export async function handler(req: Request) {
  const { message } = await req.json();
  // Works on a single instance; a second pod or a cold start sees nothing.
  const res = await chain.invoke({ input: message });
  return Response.json(res);
}

And here’s the fixed pattern:

// fixed.ts
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";
import { ChatOpenAI } from "@langchain/openai";
import { RedisChatMessageHistory } from "@langchain/community/stores/message/ioredis";

export async function handler(req: Request) {
  const { message, sessionId } = await req.json();

  // History lives in Redis, keyed by session, so any instance can read it.
  const history = new RedisChatMessageHistory({
    sessionId,
    sessionTTL: 60 * 60 * 24, // expire sessions after 24 hours
    url: process.env.REDIS_URL!,
  });

  const memory = new BufferMemory({
    returnMessages: true,
    memoryKey: "chat_history",
    chatHistory: history, // BufferMemory now reads and writes through Redis
  });

  const chain = new ConversationChain({
    llm: new ChatOpenAI({ modelName: "gpt-4o-mini" }),
    memory,
  });

  const res = await chain.invoke({ input: message });
  return Response.json(res);
}
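
Two notes on the fixed version: re-creating the chain and memory inside the handler is fine, because all conversational state now lives in Redis and the handler stays stateless. And the ioredis import path above expects the ioredis package to be installed alongside @langchain/community.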

Why this breaks when scaling

  • BufferMemory stores state in RAM.
  • In Kubernetes, Lambda, Vercel, or any multi-instance deployment, RAM is not shared.
  • If request A hits pod 1 and request B hits pod 2, pod 2 has no idea what happened before.

The fix

Use a shared store:

  • Redis
  • Postgres
  • DynamoDB
  • Any durable session store you can key by user/session ID

If you want persistence across instances, your chat_history must be externalized.
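
Redis is not the only option. Here is a rough sketch of the same idea on Postgres, assuming @langchain/community's Postgres message history and the pg driver (verify the exact options against your installed version):

// postgres-history.ts (sketch; option names may vary by version)
import pg from "pg";
import { PostgresChatMessageHistory } from "@langchain/community/stores/message/postgres";

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

// Same shape as the Redis fix: history is keyed by session and lives
// outside the Node.js process.
export function getHistory(sessionId: string) {
  return new PostgresChatMessageHistory({
    sessionId,
    pool,
    tableName: "langchain_chat_histories",
  });
}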

Other Possible Causes

1. You create a new chain and memory object on every request

Re-creating the chain per request is not itself the bug. The problem is that when the history backend is also in-memory, every request starts with a blank history.

// broken
export async function handler(req: Request) {
  const memory = new BufferMemory({ memoryKey: "chat_history" });
  const chain = new ConversationChain({
    llm: new ChatOpenAI({ modelName: "gpt-4o-mini" }),
    memory,
  });

  return Response.json(await chain.invoke({ input: "hello" }));
}

Fix:

// better: per-request construction is fine once history is external
export async function handler(req: Request) {
  const { sessionId } = await req.json();
  const history = new RedisChatMessageHistory({
    sessionId, // from the request, not hardcoded
    url: process.env.REDIS_URL!,
  });
  // ...wire `history` into BufferMemory and the chain as in fixed.ts
}

2. Your session key changes between requests

If sessionId is derived from a volatile value like a request ID or anonymous cookie that rotates, persistence will look broken even though storage works.

// bad session identity
const sessionId = crypto.randomUUID();

Use a stable identifier:

  • authenticated user ID
  • account ID
  • conversation ID stored client-side

// good session identity: stable across turns for the same conversation
const sessionId = req.headers.get("x-user-id")!;
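
For anonymous users, one workable pattern is to mint a conversation ID once and send it back to the client so the same value returns on every turn. A minimal sketch (the cookie name and helper are hypothetical):

// Hypothetical helper: reuse a conversation ID from a cookie,
// minting one only on the very first turn.
function getConversationId(req: Request): { id: string; isNew: boolean } {
  const cookies = req.headers.get("cookie") ?? "";
  const match = cookies.match(/(?:^|;\s*)conversationId=([^;]+)/);
  return match
    ? { id: match[1], isNew: false }
    : { id: crypto.randomUUID(), isNew: true };
}

// When isNew is true, set "Set-Cookie: conversationId=<id>" on the
// response so the next turn reuses the same ID instead of rotating.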

3. You’re using the wrong LangChain class for your version

LangChain TypeScript has changed APIs across versions. A common mismatch is mixing old ConversationChain/BufferMemory patterns with newer runnable-based flows and expecting memory behavior to be identical.

If you see errors like:

  • TypeError: chain.invoke is not a function
  • Cannot read properties of undefined (reading 'loadMemoryVariables')

check your package versions:

npm ls langchain @langchain/core @langchain/openai @langchain/community

Make sure the imports match the version you installed.
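
If you are on a newer, runnable-based version, the rough equivalent of the ConversationChain pattern is RunnableWithMessageHistory. A sketch assuming recent @langchain/core and @langchain/community versions (check the API against what npm ls reports):

import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { RedisChatMessageHistory } from "@langchain/community/stores/message/ioredis";

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant."],
  new MessagesPlaceholder("chat_history"),
  ["human", "{input}"],
]);

const chainWithHistory = new RunnableWithMessageHistory({
  runnable: prompt.pipe(new ChatOpenAI({ model: "gpt-4o-mini" })),
  // History is still external and keyed by session, as in the fix above.
  getMessageHistory: (sessionId) =>
    new RedisChatMessageHistory({ sessionId, url: process.env.REDIS_URL! }),
  inputMessagesKey: "input",
  historyMessagesKey: "chat_history",
});

// The session ID travels in the invocation config, not in the chain itself.
const res = await chainWithHistory.invoke(
  { input: "hello" },
  { configurable: { sessionId: "conversation-42" } }
);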

4. Your platform is resetting the filesystem or runtime between invocations

This happens on serverless platforms when people try to persist chat state to local disk.

// fragile in serverless
import { promises as fs } from "node:fs";

await fs.writeFile("/tmp/chat-history.json", JSON.stringify(history));

That may work locally and fail under scale because:

  • /tmp is ephemeral
  • instances are not sticky
  • cold starts discard runtime state

Use external storage instead of local files.
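
On Lambda specifically, DynamoDB is a natural fit. A sketch assuming @langchain/community's DynamoDB message history and a pre-created table (the table and key names here are placeholders):

import { DynamoDBChatMessageHistory } from "@langchain/community/stores/message/dynamodb";

// Assumes a DynamoDB table with a string partition key named "id".
export function getHistory(sessionId: string) {
  return new DynamoDBChatMessageHistory({
    tableName: "chat-histories", // placeholder table name
    partitionKey: "id",
    sessionId,
    config: { region: process.env.AWS_REGION },
  });
}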

How to Debug It

  1. Log the session ID on every request

    • Confirm it stays stable across turns.
    • If it changes, your “memory bug” is really an identity bug.
  2. Log what storage backend you’re actually using

    • If you see BufferMemory with no chatHistory, you’re probably storing state in process memory.
    • Add logs around initialization:

      console.log("sessionId", sessionId);
      console.log("redisUrl set?", Boolean(process.env.REDIS_URL));

  3. Check whether requests hit different instances

    • In Kubernetes or ECS, log hostname/pod name.
    • If turn one and turn two land on different pods and you use in-memory storage, persistence will fail by design.
  4. Inspect package versions and imports

    • Mixed imports from langchain, @langchain/core, and @langchain/community can produce subtle breakage.
    • Verify that your history class actually matches your installed version.
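
To cover steps 1 through 3 in a single log line, here is a small hypothetical helper you can call at the top of every handler:

import os from "node:os";

// Hypothetical helper: one structured log line per turn.
export function logTurnContext(sessionId: string) {
  console.log(JSON.stringify({
    sessionId,                                              // step 1: stable across turns?
    backend: process.env.REDIS_URL ? "redis" : "in-memory", // step 2: which store?
    instance: os.hostname(),                                // step 3: which pod served this?
    pid: process.pid,
  }));
}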

Prevention

  • Use a shared chat history backend from day one:
    • Redis for speed
    • Postgres for durability and auditability
  • Key every conversation with a stable session identifier.
  • Treat local memory classes like BufferMemory as single-process tools only; they are not persistence.

If you want this to survive scale tests, don’t ask LangChain to remember things it never stored outside RAM. Store history in Redis or Postgres, pass a stable sessionId, and keep each request stateless except for its lookup into shared state.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

