How to Fix 'intermittent 500 errors' in AutoGen (TypeScript)
Intermittent 500 errors in AutoGen TypeScript usually mean the request made it into your app, but something inside the agent flow failed before a clean response came back. In practice, this shows up when tool execution, message handling, or model calls are unstable under certain inputs or concurrency levels.
The key detail: “intermittent” almost always points to state, async timing, or malformed messages rather than a permanent config problem. If you only see it on some requests, start by looking at the agent lifecycle and any code that mutates shared objects.
The Most Common Cause
The #1 cause I see is throwing inside a tool or message handler without catching and converting the error into a structured result. In AutoGen, that often bubbles up as a server-side 500 with messages like:
- Error: Tool execution failed
- TypeError: Cannot read properties of undefined
- UnhandledPromiseRejectionWarning
- AutoGenError: Failed to generate reply
Here’s the broken pattern versus the fixed pattern.
| Broken | Fixed |
|---|---|
| Tool throws directly | Tool returns safe error payload |
| Shared mutable state | Per-request state passed explicitly |
| No guard on optional fields | Validate before access |
// BROKEN
import { AssistantAgent } from "@autogen/agent";
import { FunctionTool } from "@autogen/tools";

const getCustomerBalance = new FunctionTool({
  name: "getCustomerBalance",
  description: "Fetch balance for a customer",
  execute: async (args: { customerId?: string }) => {
    // Throws when customerId is missing or invalid
    const balance = await fetchBalance(args.customerId!.trim());
    return { balance };
  },
});

const agent = new AssistantAgent({
  name: "support-agent",
  modelClient,
  tools: [getCustomerBalance],
});

// Somewhere in your route handler
export async function POST(req: Request) {
  const body = await req.json();
  const result = await agent.run(body.message);
  return Response.json(result);
}
// FIXED
// fetchBalance and modelClient are assumed to be defined elsewhere in your app.
import { AssistantAgent } from "@autogen/agent";
import { FunctionTool } from "@autogen/tools";

const getCustomerBalance = new FunctionTool({
  name: "getCustomerBalance",
  description: "Fetch balance for a customer",
  execute: async (args: { customerId?: string }) => {
    // Validate before access instead of relying on a non-null assertion
    if (!args.customerId || typeof args.customerId !== "string") {
      return {
        ok: false,
        error: "customerId is required",
      };
    }
    try {
      const balance = await fetchBalance(args.customerId.trim());
      return { ok: true, balance };
    } catch (err) {
      // Convert the failure into a structured result instead of throwing
      return {
        ok: false,
        error: err instanceof Error ? err.message : "Unknown tool failure",
      };
    }
  },
});

export async function POST(req: Request) {
  const body = await req.json();
  try {
    // Agent is created per request, so no state is shared between calls
    const agent = new AssistantAgent({
      name: "support-agent",
      modelClient,
      tools: [getCustomerBalance],
    });
    const result = await agent.run(body.message);
    return Response.json(result);
  } catch (err) {
    console.error("AutoGen request failed:", err);
    return Response.json(
      { error: "Agent execution failed" },
      { status: 500 }
    );
  }
}
If your tool throws, AutoGen may surface it as a generic failure even though the real bug is one line deep in your own code. The fix is to make tools deterministic and non-throwing whenever possible.
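If you have many tools, repeating the try/catch in each one gets tedious. Here is a minimal sketch of a generic wrapper that turns any throw into a structured result. The names `safeExecute`, `ToolResult`, and `lookupBalance` are hypothetical helpers for illustration and are not part of AutoGen.

```typescript
// Structured result shape; the `ok` flag mirrors the FIXED example above.
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Wraps an async tool body so a throw becomes a structured failure.
// `safeExecute` is a hypothetical helper name, not an AutoGen API.
function safeExecute<A, T>(
  body: (args: A) => Promise<T>,
): (args: A) => Promise<ToolResult<T>> {
  return async (args: A) => {
    try {
      return { ok: true, value: await body(args) };
    } catch (err) {
      return {
        ok: false,
        error: err instanceof Error ? err.message : "Unknown tool failure",
      };
    }
  };
}

// Stand-in for a real lookup; it throws on missing input like the BROKEN example.
async function lookupBalance(args: { customerId?: string }): Promise<number> {
  if (!args.customerId) throw new Error("customerId is required");
  return 100;
}

const safeLookup = safeExecute(lookupBalance);
```

You would pass something like `safeLookup` as the tool's execute function, so the agent always receives a value it can reason about rather than an exception.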
Other Possible Causes
1. Reusing an agent instance across requests
If you keep one AssistantAgent in module scope and mutate its history or runtime state, concurrent requests can collide.
// Bad
const agent = new AssistantAgent({ name: "support-agent", modelClient });

export async function POST(req: Request) {
  const { message } = await req.json();
  return Response.json(await agent.run(message));
}
Create the agent per request unless you’ve confirmed the class is safe for concurrent reuse.
// Good
export async function POST(req: Request) {
  const { message } = await req.json();
  const agent = new AssistantAgent({ name: "support-agent", modelClient });
  return Response.json(await agent.run(message));
}
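To see why a shared instance can collide, here is a small sketch using a hypothetical `HistoryAgent` class that keeps history on the instance. It is a stand-in I made up for illustration, not the real AssistantAgent, and it only assumes that an agent accumulates state across an awaited call.

```typescript
// Hypothetical stand-in for an agent that stores conversation history on the instance.
class HistoryAgent {
  private history: string[] = [];

  async run(message: string): Promise<string[]> {
    this.history.push(message);
    // Yield to the event loop, as a real model call would.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
    // Return everything this instance has seen so far.
    return [...this.history];
  }
}

// Shared instance: concurrent requests see each other's messages.
async function runShared(messages: string[]): Promise<string[][]> {
  const shared = new HistoryAgent();
  return Promise.all(messages.map((m) => shared.run(m)));
}

// Per-request instance: each request sees only its own message.
async function runIsolated(messages: string[]): Promise<string[][]> {
  return Promise.all(messages.map((m) => new HistoryAgent().run(m)));
}
```

With two concurrent messages, every result from `runShared` contains both messages, while each result from `runIsolated` contains exactly one. This is the kind of cross-request leakage that only shows up under load.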
2. Invalid chat payload shape
AutoGen TypeScript wrappers are strict about message structure. A malformed messages array can trigger failures like:
- Invalid message format
- Role must be one of user|assistant|system
- Cannot serialize undefined content
// Bad
await agent.run([
  { role: "user", content: undefined },
]);

// Good
await agent.run([
  { role: "user", content: String(input ?? "") },
]);
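A stricter option is to validate the whole array before it reaches the agent. The sketch below assumes the three roles listed in the error message above and treats empty content as invalid; `normalizeMessages` and `ChatMessage` are hypothetical names, so adjust the shape to whatever your wrapper actually expects.

```typescript
type Role = "user" | "assistant" | "system";

interface ChatMessage {
  role: Role;
  content: string;
}

const ROLES: readonly string[] = ["user", "assistant", "system"];

// Returns validated messages, or throws a descriptive error the route can map to a 400.
function normalizeMessages(input: unknown): ChatMessage[] {
  if (!Array.isArray(input)) {
    throw new TypeError("messages must be an array");
  }
  return input.map((raw, i) => {
    const m = (raw ?? {}) as { role?: unknown; content?: unknown };
    if (typeof m.role !== "string" || ROLES.indexOf(m.role) === -1) {
      throw new TypeError(`messages[${i}].role is invalid`);
    }
    if (typeof m.content !== "string" || m.content.trim() === "") {
      throw new TypeError(`messages[${i}].content must be a non-empty string`);
    }
    return { role: m.role as Role, content: m.content };
  });
}
```

Throwing here is deliberate: this runs in your route handler before the agent starts, so you can catch it and return a 400 instead of letting a malformed payload surface later as a 500.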
3. Model client timeout or rate limit spikes
Intermittent failures often come from upstream model calls, especially under burst traffic.
const modelClient = new OpenAIChatCompletionClient({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
  timeoutMs: 30000,
});
If you see logs like 429 Too Many Requests or ETIMEDOUT, add retry logic with backoff at the boundary where you call the model client.
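A minimal sketch of that retry boundary, using exponential backoff with jitter. `withRetry` and `isTransient` are hypothetical helpers, and the check assumes your client surfaces an HTTP `status` or a Node-style `code` on the error; verify what your model client actually throws before relying on it.

```typescript
interface RetryOptions {
  retries: number; // extra attempts after the first call
  baseDelayMs: number; // nominal delay before the first retry; doubles each time
  isRetryable: (err: unknown) => boolean;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  call: () => Promise<T>,
  { retries, baseDelayMs, isRetryable }: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up when attempts are exhausted or the error is not transient.
      if (attempt >= retries || !isRetryable(err)) throw err;
      // Exponential backoff with jitter in [0.5, 1.0) of the nominal delay.
      const nominal = baseDelayMs * 2 ** attempt;
      await sleep(nominal * (0.5 + Math.random() * 0.5));
    }
  }
}

// Assumption: the client exposes an HTTP status or a Node-style error code.
function isTransient(err: unknown): boolean {
  const e = err as { status?: number; code?: string } | null;
  return e?.status === 429 || e?.code === "ETIMEDOUT";
}
```

Wrap only the model call, not the whole agent run. Retrying the full run would also re-execute tools, which matters if any of them have side effects.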
4. Non-idempotent tool side effects
If a tool writes to a database, sends email, or updates policy state and then gets retried, you can get partial failures that look random.
execute: async ({ claimId }) => {
  // Risky if retried twice
  await db.claims.update({ id: claimId }, { status: "reviewed" });
}
Use idempotency keys or check current state before writing.
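Here is a sketch of the "check current state before writing" approach. It uses an in-memory `Map` as a stand-in for the claims table, and `markReviewed` is a hypothetical tool body, so treat it as an illustration of the control flow rather than production code.

```typescript
type ClaimStatus = "pending" | "reviewed";

// In-memory stand-in for the claims table; a real tool would query the database.
const claims = new Map<string, ClaimStatus>([["claim-1", "pending"]]);
let writes = 0; // counts actual state changes, for demonstration only

// Checks current state first, so a retried call becomes a no-op.
async function markReviewed(
  claimId: string,
): Promise<{ ok: boolean; changed: boolean; error?: string }> {
  const current = claims.get(claimId);
  if (current === undefined) {
    return { ok: false, changed: false, error: "claim not found" };
  }
  if (current === "reviewed") {
    // Already applied: report success without writing again.
    return { ok: true, changed: false };
  }
  claims.set(claimId, "reviewed");
  writes++;
  return { ok: true, changed: true };
}
```

Against a real database, a separate read followed by a write can still race between two concurrent requests. A conditional update (for example, updating only rows whose status is still "pending") or a unique idempotency key closes that gap.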
How to Debug It
- Capture the full stack trace
  - Log the exact exception before AutoGen wraps it.
  - Look for your own code first, not just AutoGenError.
- Disable tools one by one
  - Run the same prompt with no tools.
  - If the error disappears, re-enable tools individually until it returns.
- Log raw input and output shapes
  - Print the incoming request body.
  - Print tool args before execution.
  - Verify every message has role and non-empty content.
- Test concurrency
  - Fire multiple parallel requests against the same endpoint.
  - If failures increase under load, suspect shared mutable state or rate limiting.
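For the concurrency step, a small harness is usually enough. The sketch below fires a batch of calls at once and counts how many failed. `fireParallel` is a hypothetical helper, and `send` is whatever issues one request in your setup.

```typescript
interface LoadReport {
  total: number;
  failed: number;
  errors: string[];
}

// Fires `count` calls at once and tallies rejections and 5xx responses.
// `send` issues a single request and resolves with at least a `status` field.
async function fireParallel(
  count: number,
  send: (i: number) => Promise<{ status: number }>,
): Promise<LoadReport> {
  const outcomes = await Promise.all(
    Array.from({ length: count }, (_, i) =>
      send(i).then(
        (res) => (res.status >= 500 ? `HTTP ${res.status}` : null),
        (err) => String(err),
      ),
    ),
  );
  const errors = outcomes.filter((o): o is string => o !== null);
  return { total: count, failed: errors.length, errors };
}
```

In practice `send` could be a `fetch` call against your own route, since a fetch response exposes `status`. Run it at a few batch sizes: if `failed` climbs as `count` grows, that points toward shared state or rate limiting rather than a bad input.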
Prevention
- Create agents per request unless you have proven shared reuse is safe.
- Make every tool validate inputs and return structured errors instead of throwing.
- Add integration tests that run parallel requests against your AutoGen endpoint.
- Log upstream failures separately from application failures so 500 does not hide the real root cause.
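One way to keep upstream and application failures apart is to classify the error before logging it. This is a sketch under assumptions: it guesses that upstream errors carry an HTTP `status` or a Node-style `code`, and the choice of 502 for upstream failures is my own convention here, not something AutoGen prescribes. `classifyFailure` and `toLogEntry` are hypothetical names.

```typescript
type FailureKind = "upstream" | "application";

const UPSTREAM_CODES = ["ETIMEDOUT", "ECONNRESET"];

// Assumption: upstream errors carry an HTTP `status` or a Node-style `code`.
function classifyFailure(err: unknown): FailureKind {
  const e = err as { status?: unknown; code?: unknown } | null;
  if (typeof e?.status === "number" && (e.status === 429 || e.status >= 500)) {
    return "upstream";
  }
  if (typeof e?.code === "string" && UPSTREAM_CODES.indexOf(e.code) !== -1) {
    return "upstream";
  }
  return "application";
}

// Builds a log entry and picks a response status that reflects the source of the failure.
function toLogEntry(
  err: unknown,
): { kind: FailureKind; httpStatus: number; message: string } {
  const kind = classifyFailure(err);
  return {
    kind,
    httpStatus: kind === "upstream" ? 502 : 500,
    message: err instanceof Error ? err.message : String(err),
  };
}
```

Logging the `kind` field lets you filter dashboards by source, so a spike in rate limits does not look like a regression in your own code.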
If you’re seeing an intermittent 500, assume it’s one of three things first:
- a thrown tool,
- shared mutable state,
- or bad message shape.
That covers most production AutoGen TypeScript failures I’ve debugged in real systems.
By Cyprian Aarons, AI Consultant at Topiax.