How to Fix 'memory not persisting when scaling' in AutoGen (TypeScript)
If your AutoGen TypeScript agent keeps “forgetting” memory after you scale from one process to multiple replicas, the issue is usually not AutoGen itself: your memory is living in process-local state, so every new pod, worker, or serverless instance starts with an empty store.
You’ll usually see this when moving from local dev to Docker, Kubernetes, PM2 cluster mode, or a serverless deployment. The symptom is simple: the agent answers correctly on one request, then behaves like it has no prior context on the next request.
The Most Common Cause
The #1 cause is using in-memory storage for memory objects, then scaling horizontally.
In AutoGen TypeScript, this often happens when you create `Memory` / `MemoryStore` instances inside the request handler or inside each worker process. That works on a single instance because the same process handles all requests. Once you scale out, each replica gets its own isolated memory.
Broken vs fixed pattern
| Broken pattern | Fixed pattern |
|---|---|
| Memory store created per request/process | Shared persistent store injected into all instances |
| Works locally | Fails when scaled |
| Uses `new InMemory...()` | Uses Redis/Postgres/file-backed persistence |
```typescript
// BROKEN: each replica gets its own memory
import { AssistantAgent } from "@autogen/agent";
import { InMemoryStore } from "@autogen/memory";

export async function createAgent() {
  const memory = new InMemoryStore(); // local-only: dies with the process
  const agent = new AssistantAgent({
    name: "support-agent",
    modelClient, // assumed to be configured elsewhere
    memory,
  });
  return agent;
}
```
```typescript
// FIXED: shared persistent store, created once at module scope
import { AssistantAgent } from "@autogen/agent";
import { RedisMemoryStore } from "@autogen/memory-redis";

const memory = new RedisMemoryStore({
  url: process.env.REDIS_URL!,
  keyPrefix: "autogen:support-agent:",
});

export function createAgent() {
  return new AssistantAgent({
    name: "support-agent",
    modelClient, // assumed to be configured elsewhere
    memory,
  });
}
```
If you’re using `RoundRobinGroupChat`, `AssistantAgent`, or any custom agent wrapper, the rule is the same: do not instantiate memory inside an ephemeral runtime scope. Build it once and point it at durable storage.
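The “build it once, inject it everywhere” rule can be sketched end to end. Everything below is illustrative: the `MemoryStore` interface and `createAgent` shape are stand-ins, not the real AutoGen API, and in production the `FakeStore` would be replaced by a Redis- or Postgres-backed implementation behind the same interface.

```typescript
// Illustrative sketch -- interface and names are assumptions, not the AutoGen API.
interface MemoryStore {
  save(sessionId: string, item: string): Promise<void>;
  list(sessionId: string): Promise<string[]>;
}

// Dev-only fake; in production this would be Redis/Postgres-backed.
class FakeStore implements MemoryStore {
  private items = new Map<string, string[]>();
  async save(sessionId: string, item: string): Promise<void> {
    const existing = this.items.get(sessionId) ?? [];
    existing.push(item);
    this.items.set(sessionId, existing);
  }
  async list(sessionId: string): Promise<string[]> {
    return this.items.get(sessionId) ?? [];
  }
}

// Built once at module scope, then injected -- never created inside a handler.
const sharedStore: MemoryStore = new FakeStore();

function createAgent(name: string, memory: MemoryStore) {
  // Stand-in for `new AssistantAgent({ ... })`; only the injection pattern matters.
  return { name, memory };
}

const support = createAgent("support-agent", sharedStore);
const billing = createAgent("billing-agent", sharedStore);
// Both agents read and write the same history because they share one store.
```

Because the store is a constructor argument rather than something each agent news up internally, swapping `FakeStore` for a durable implementation changes one line and leaves every agent untouched.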
Other Possible Causes
1) You’re not using a stable session key
If your memory lookup key changes between requests, AutoGen will behave like it has no history.
```typescript
// BAD: random key every request -- every turn starts a new conversation
const sessionId = crypto.randomUUID();
await memory.save(sessionId, message);

// GOOD: stable user/session identifier supplied by the client
const sessionId = req.headers["x-session-id"] as string;
await memory.save(sessionId, message);
```
If you are behind a load balancer, make sure the same logical conversation always resolves to the same session ID.
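A minimal sketch of that rule, assuming your API layer forwards an `x-session-id` header (the header name comes from the example above; adapt it to your routing). The key design choice is to fail loudly rather than silently mint a fresh ID, since a random fallback would reintroduce the “new conversation every request” bug:

```typescript
// Illustrative helper -- the header name is an assumption about your API layer.
type IncomingHeaders = Record<string, string | undefined>;

function resolveSessionId(headers: IncomingHeaders): string {
  const fromHeader = headers["x-session-id"];
  if (fromHeader && fromHeader.length > 0) return fromHeader;
  // Do NOT fall back to crypto.randomUUID() here: that silently breaks memory.
  throw new Error("Missing x-session-id: client must send a stable session identifier");
}
```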
2) Your deployment is restarting workers
With PM2 cluster mode or Kubernetes rolling updates, workers die and come back clean. Any `Map`, singleton, or module-scoped cache disappears with them.
```typescript
// BAD: module-scoped cache only lives in one worker and dies on restart
const conversationCache = new Map<string, unknown>();
```
Use Redis or another external store instead:
```typescript
// GOOD: external persistence survives restarts and is shared across workers
await redis.hSet(`conv:${sessionId}`, {
  lastMessage: JSON.stringify(message),
});
```
3) You are serializing the wrong object shape
Some teams persist only part of the agent state and forget the actual memory payload. Then restore code runs without error, but there is nothing useful to hydrate.
```typescript
// BAD: saving config but not memory contents
await db.save("agentState", {
  name: agent.name,
  model: "gpt-4.1",
});

// GOOD: persist both config and conversation state
await db.save("agentState", {
  name: agent.name,
  model: "gpt-4.1",
  memoryItems: await memory.list(sessionId),
});
```
This matters if you rebuild agents on every request instead of keeping them warm.
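Here is a sketch of the rebuild-and-hydrate path using the persisted shape from the example above (`{ name, model, memoryItems }`). The in-memory `fakeDb` and the helper names are stand-ins for your real database client, not AutoGen APIs:

```typescript
// Sketch: rebuild an agent per request and hydrate its memory payload,
// not just its config. All names here are illustrative stand-ins.
interface AgentState {
  name: string;
  model: string;
  memoryItems: string[];
}

const fakeDb = new Map<string, AgentState>();

function saveState(key: string, state: AgentState): void {
  fakeDb.set(key, state);
}

function rebuildAgent(key: string) {
  const state = fakeDb.get(key);
  if (!state) throw new Error(`No persisted state for ${key}`);
  // Hydrating memoryItems is the step that is easy to forget
  // when agents are rebuilt on every request.
  return { name: state.name, model: state.model, history: [...state.memoryItems] };
}

saveState("agentState", {
  name: "support-agent",
  model: "gpt-4.1",
  memoryItems: ["user: hi", "agent: hello"],
});
const rebuilt = rebuildAgent("agentState");
// rebuilt.history now contains the prior conversation.
```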
4) Your adapter does not support concurrent writes
Some custom stores look fine until two requests write at once. Then one update overwrites another and you get partial or missing history.
```typescript
// BAD: naive overwrite -- concurrent writers clobber each other's updates
await fs.writeFile(path, JSON.stringify(messages));
```
Prefer atomic operations or a datastore with transactions:
```typescript
// GOOD: append inside a Redis transaction (or a DB transaction)
await redis.multi()
  .rPush(`conv:${sessionId}`, JSON.stringify(message))
  .exec();
```
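The lost-update failure mode and one common fix can be sketched without Redis. This is an illustrative compare-and-set over an in-memory map; a real store enforces the version check atomically (Redis `WATCH`/`MULTI`, or a SQL `UPDATE` guarded by a version column):

```typescript
// Sketch: versioned append so no write silently discards another.
// The Map stands in for a file or database row.
interface Versioned {
  version: number;
  messages: string[];
}

const convStore = new Map<string, Versioned>();
convStore.set("conv:s1", { version: 0, messages: [] });

function casAppend(key: string, message: string): boolean {
  const current = convStore.get(key);
  if (!current) return false;
  const expected = current.version;
  // A real datastore performs this "has anyone written since I read?"
  // check atomically; here it is sequential, so it always succeeds.
  if (convStore.get(key)!.version !== expected) return false; // caller retries
  convStore.set(key, {
    version: expected + 1,
    messages: [...current.messages, message],
  });
  return true;
}

casAppend("conv:s1", "first");
casAppend("conv:s1", "second");
// Both messages survive because every append bumps the version,
// whereas a naive read-modify-write of the whole array would let
// two concurrent writers each keep only their own message.
```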
How to Debug It
- Log the session ID and worker identity
  - Print `sessionId`, pod name, PID, and request ID.
  - If the same user hits different workers with different IDs, your “memory loss” may just be routing plus local storage.
- Inspect where memory is instantiated
  - Search for `new InMemoryStore()`, `new Map()`, or any store created inside handlers.
  - If it’s inside a route function or Lambda handler, that’s your first fix.
- Check persistence after one write
  - Write a message.
  - Read it back immediately from the backing store.
  - If read-after-write fails, your problem is storage durability, not AutoGen logic.
- Temporarily force single-instance execution
  - Run one pod / one worker / one Node process.
  - If persistence works there but fails under scale-out, you’ve confirmed process-local state leakage.
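The first check in that list can be a small helper. This sketch uses Node’s standard `os` module and `process.pid`; the field names are arbitrary, so adapt them to your logger:

```typescript
import * as os from "node:os";

// Debug helper: one structured log line tying a session to the worker
// that handled it. Field names are illustrative.
function workerIdentityLog(sessionId: string, requestId: string): string {
  return JSON.stringify({
    sessionId,
    requestId,
    pid: process.pid,
    host: os.hostname(),
  });
}

// If the same sessionId shows up with different pid/host values and a
// different history each time, you are looking at routing plus
// process-local state, not an AutoGen bug.
console.log(workerIdentityLog("sess-123", "req-1"));
```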
A useful sanity check is to compare what your app thinks it stored versus what actually exists in Redis/Postgres/file storage. If AutoGen throws something like `Error: No messages found for session ...` or your own wrapper logs an empty history after a successful turn, that points directly at state isolation.
Prevention
- Use external persistence by default for anything conversational:
  - Redis for short-lived session memory
  - Postgres for durable audit/history storage
- Treat in-memory stores as dev-only:
  - Fine for local tests
  - Not acceptable for horizontally scaled production agents
- Keep session IDs stable and explicit:
  - Pass them from your API layer
  - Never generate a new ID per request unless that’s intentional
If you want AutoGen TypeScript agents to survive scaling events, design memory like infrastructure, not like application state. Once you move state out of the process and into a real backing store, this class of bug disappears fast.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit