How to Fix 'chain execution stuck when scaling' in AutoGen (TypeScript)
When AutoGen chain execution gets “stuck when scaling,” it usually means your agent workflow is waiting on something that never resolves: a tool call, a message handoff, or an async task that was started but not properly awaited. In TypeScript, this shows up most often once you move from one-off demos to parallel runs, nested agents, or long-lived conversations.
The symptom is usually one of these:
- chain execution stuck when scaling
- TimeoutError: Operation timed out after ...
- Error: AgentRuntime is not responding
- a conversation that logs the first few turns and then stops forever
The Most Common Cause
The #1 cause is firing async agent work inside a loop or callback without awaiting it, then letting multiple runs compete for the same runtime, shared state, or message queue.
In AutoGen TypeScript, this often happens with AssistantAgent, UserProxyAgent, or RoundRobinGroupChat when you start multiple chats concurrently and reuse the same agent instances.
Broken pattern vs fixed pattern
| Broken | Fixed |
|---|---|
| Starts multiple executions with shared agents | Serializes or isolates execution per run |
| Uses forEach(async () => ...) | Uses for...of with await or Promise.allSettled with isolated state |
| Reuses the same chat/session object | Creates a fresh chat/runtime per request |
// Broken
import { AssistantAgent, UserProxyAgent } from "@autogen/core";

const assistant = new AssistantAgent({
  name: "assistant",
  systemMessage: "You are a helpful assistant.",
});
const user = new UserProxyAgent({
  name: "user",
});

requests.forEach(async (request) => {
  // WRONG: forEach does not await async callbacks, and every run shares the same agents
  const result = await user.initiateChat(assistant, request.prompt);
  console.log(result);
});
// Fixed
import { AssistantAgent, UserProxyAgent } from "@autogen/core";

for (const request of requests) {
  // RIGHT: sequential execution with proper awaiting and fresh agents per request
  const assistant = new AssistantAgent({
    name: `assistant-${request.id}`,
    systemMessage: "You are a helpful assistant.",
  });
  const user = new UserProxyAgent({
    name: `user-${request.id}`,
  });
  const result = await user.initiateChat(assistant, request.prompt);
  console.log(result);
}
If you need concurrency, don’t share mutable agent instances across jobs unless the library explicitly documents that as safe. Create isolated agents per job or use a worker queue.
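If you do need concurrency, one option is a small worker pool that caps how many jobs run at once and builds fresh state inside each job. The sketch below is library-agnostic: `runWithLimit` and `fakeChat` are hypothetical names introduced for illustration, not AutoGen APIs, and `fakeChat` stands in for whatever code creates new agents and awaits the chat.

```typescript
// Illustrative helper, not part of AutoGen: run jobs with a fixed concurrency
// limit and record each outcome so one failure does not stop the others.
type Outcome<R> =
  | { status: "fulfilled"; value: R }
  | { status: "rejected"; reason: unknown };

async function runWithLimit<T, R>(
  jobs: readonly T[],
  limit: number,
  runJob: (job: T) => Promise<R>,
): Promise<Outcome<R>[]> {
  const results: Outcome<R>[] = new Array(jobs.length);
  let next = 0;

  // Each worker claims the next unclaimed index until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      const job = jobs[index] as T;
      try {
        results[index] = { status: "fulfilled", value: await runJob(job) };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, jobs.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Hypothetical job: real code would construct fresh agents here and await
// the chat for this request only.
async function fakeChat(request: { id: number; prompt: string }): Promise<string> {
  const history: string[] = []; // per-job state, never shared between jobs
  history.push(request.prompt);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  if (request.prompt === "") throw new Error(`empty prompt for job ${request.id}`);
  return `job ${request.id} handled ${history.length} message(s)`;
}
```

Calling `runWithLimit(requests, 2, fakeChat)` would process at most two requests at a time and return one outcome per request in the original order. Starting with a limit of 1 and raising it gradually also makes it easier to see at what level of concurrency a hang first appears.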
Other Possible Causes
1) Missing await on the top-level chat call
This looks harmless in code review and is brutal in production. The process exits early or continues before the chain finishes.
// Broken
const result = user.initiateChat(assistant, "Summarize this contract");
console.log("done");
// Fixed
const result = await user.initiateChat(assistant, "Summarize this contract");
console.log("done");
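A quick way to see the difference is a minimal sketch with a stand-in for the chat call. `fakeInitiateChat` and `missingAwaitDemo` are hypothetical names used only for this illustration. Without `await`, the variable holds a pending Promise and execution moves on before the result exists.

```typescript
// Hypothetical stand-in for a chat call that takes some time to finish.
function fakeInitiateChat(prompt: string): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`summary of: ${prompt}`), 20);
  });
}

async function missingAwaitDemo(): Promise<string[]> {
  const events: string[] = [];

  // Missing await: `pending` is a Promise, and execution continues immediately.
  const pending = fakeInitiateChat("contract A").then((text) => {
    events.push(`late result: ${text}`);
    return text;
  });
  events.push(`without await: ${pending instanceof Promise ? "Promise" : "string"}`);

  // With await: the next line runs only after the chat has resolved.
  const result = await fakeInitiateChat("contract B");
  events.push(`with await: ${result}`);

  await pending; // settle the first call so nothing is left dangling
  return events;
}
```

The first recorded event is always the "without await" line, because it is pushed before either fake chat has had a chance to resolve.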
2) Tool/function handler never resolves
If you registered tools with AutoGen and the handler hangs, the chain waits forever on a tool call like FunctionCall / ToolCall.
// Broken
tools.set("lookupPolicy", async (policyId: string) => {
  await fetchPolicy(policyId); // if fetchPolicy hangs or never returns
});
// Fixed
tools.set("lookupPolicy", async (policyId: string) => {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 5000);
  try {
    return await fetchPolicy(policyId, { signal: controller.signal });
  } finally {
    clearTimeout(timeout);
  }
});
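Not every client accepts an `AbortSignal`. A generic fallback is to race the handler against a timer. `withTimeout` and `hangingLookup` below are hypothetical names, not AutoGen APIs. Note that this approach only stops waiting: the underlying work keeps running unless it is cancelled separately.

```typescript
// Illustrative helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared in either case.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Hypothetical handler that never resolves, to show the timeout firing.
const hangingLookup = (): Promise<string> => new Promise<string>(() => {});
```

Wrapping a call as `withTimeout(hangingLookup(), 5000, "lookupPolicy")` turns a silent hang into a rejected Promise with a message that names the tool, which the chain can then surface or retry.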
3) Deadlock from nested agent calls
Calling one AutoGen chain from inside another can deadlock if both share the same runtime/event loop resources.
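// Broken
assistant.onMessage(async (msg) => {
  // Awaits a second chain from inside the handler of the first
  const summary = await summarizeAgent.initiateChat(assistant, msg.content);
  return summary;
});
// Fixed
assistant.onMessage(async (msg) => {
  // Hand the nested work off instead of awaiting it here;
  // the handler no longer returns the summary
  queueMicrotask(() => {
    void summarizeAsync(msg.content);
  });
});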
If you must nest calls, isolate them into separate runtimes or separate service boundaries.
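To make the failure mode concrete, the sketch below models a runtime as a queue that runs one task at a time. This is an assumption made purely for illustration and does not claim that AutoGen's runtime is implemented this way. `SerialQueue`, `settlesWithin`, and `nestedCallDemo` are hypothetical names.

```typescript
// Illustrative serial queue: runs one task at a time, in submission order.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain usable even if a task rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Resolves to true if `work` settles within `ms` milliseconds, else false.
async function settlesWithin<T>(work: Promise<T>, ms: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), ms);
  });
  const done = work.then(() => true, () => true);
  const settled = await Promise.race([done, timedOut]);
  if (timer !== undefined) clearTimeout(timer);
  return settled;
}

async function nestedCallDemo(): Promise<{ sharedSettled: boolean; isolatedSettled: boolean }> {
  // Shared queue: the outer task waits for the inner task, but the inner task
  // is queued behind the outer one, so neither can finish.
  const shared = new SerialQueue();
  const nestedOnShared = shared.run(() => shared.run(async () => "inner"));
  const sharedSettled = await settlesWithin(nestedOnShared, 50);

  // Separate queues: the inner task runs on its own queue and completes.
  const outer = new SerialQueue();
  const inner = new SerialQueue();
  const nestedIsolated = outer.run(() => inner.run(async () => "inner"));
  const isolatedSettled = await settlesWithin(nestedIsolated, 50);

  return { sharedSettled, isolatedSettled };
}
```

In this model the shared-queue case never settles while the isolated case does, which is the same shape as a nested chain waiting on the runtime that is busy running its parent.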
4) Unbounded message history / context growth
Scaling failures often show up only after enough turns. Once context gets large enough, responses slow down and the chain looks stuck.
// Broken: unbounded history
const chat = new RoundRobinGroupChat({
  participants,
});
// Fixed: cap the number of turns so the conversation cannot grow without bound
const chat = new RoundRobinGroupChat({
  participants,
  maxTurns: 12,
});
Also trim messages before passing them into follow-up chains. Don’t keep appending every intermediate artifact forever.
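One simple trimming policy is to keep the system messages and only the most recent few turns. The sketch below assumes a hypothetical `ChatMessage` shape and a hypothetical `trimHistory` helper; adapt both to whatever message type your chat library actually uses.

```typescript
// Hypothetical message shape, introduced only for this example.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Keep every system message plus the most recent `maxRecent` other messages.
function trimHistory(messages: readonly ChatMessage[], maxRecent: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // Guard against slice(-0), which would return the whole array.
  const recent = maxRecent > 0 ? rest.slice(-maxRecent) : [];
  return [...system, ...recent];
}
```

Depending on the model API, a tool result may need to stay paired with the call that produced it, so a production version might trim on turn boundaries rather than on a raw message count.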
How to Debug It
- Check whether you’re sharing agent instances across requests
  - If one AssistantAgent serves many concurrent chats, clone it per job.
  - Look for singleton patterns in your DI container or module scope.
- Verify every async boundary is awaited
  - Search for:
    - forEach(async
    - missing await
    - .then(...) chains without return values
  - Add logs before and after each awaited call to find the exact stall point.
- Instrument tool handlers
  - Put timing around every function/tool call.
  - If you see logs like “tool started” but never “tool finished,” the hang is in your handler, not AutoGen.
- Run with one request only
  - If single-request mode works and parallel mode hangs, it’s almost always shared mutable state.
  - Reduce concurrency to 1, then increase until it breaks.
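The "tool started" / "tool finished" logging described above can be added with a small wrapper around each handler. `instrument` and `ToolHandler` are hypothetical names for this sketch, not AutoGen APIs, and the example handler is a placeholder.

```typescript
type ToolHandler<A extends unknown[], R> = (...args: A) => Promise<R>;

// Illustrative wrapper: logs start, finish, and failure with elapsed time.
function instrument<A extends unknown[], R>(
  name: string,
  handler: ToolHandler<A, R>,
  log: (line: string) => void = console.log,
): ToolHandler<A, R> {
  return async (...args: A): Promise<R> => {
    const startedAt = Date.now();
    log(`tool started: ${name}`);
    try {
      const result = await handler(...args);
      log(`tool finished: ${name} (${Date.now() - startedAt} ms)`);
      return result;
    } catch (error) {
      log(`tool failed: ${name} (${Date.now() - startedAt} ms)`);
      throw error;
    }
  };
}

// Hypothetical handler used only to demonstrate the wrapper.
const lookupPolicy = instrument("lookupPolicy", async (policyId: string) => {
  return { policyId, status: "active" };
});
```

A "tool started" line with no matching "finished" or "failed" line points at the wrapped handler as the place where execution stalled.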
Prevention
- Create one chat runtime per request unless you have a documented reason not to.
- Put timeouts on all external calls used by tools:
  - HTTP APIs
  - database queries
  - queue consumers
- Avoid forEach(async ...) in orchestration code. Use:
  - for...of for ordered execution
  - Promise.allSettled() for controlled concurrency
The real fix is usually boring: isolate state, await everything, and put hard timeouts around every external dependency. Once you do that, AutoGen TypeScript stops “sticking” and starts behaving like a normal distributed workflow instead of an accidental deadlock machine.
By Cyprian Aarons, AI Consultant at Topiax.