AutoGen Tutorial (TypeScript): implementing retry logic for advanced developers

By Cyprian Aarons. Updated 2026-04-21

This tutorial shows you how to add retry logic around AutoGen agent calls in TypeScript without turning your app into a pile of nested try/catch blocks. You need this when model calls fail intermittently, tool execution flakes out, or you want bounded, predictable backoff before giving up.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • @autogenai/autogen
  • dotenv
  • An OpenAI API key set as OPENAI_API_KEY
  • A terminal and a project with npm or pnpm

Step-by-Step

  1. Start with a minimal AutoGen setup and load your environment variables. Keep the agent configuration boring; retry logic should sit outside the agent so it stays reusable across workflows.
import "dotenv/config";
import { AssistantAgent, UserMessage } from "@autogenai/autogen";

const agent = new AssistantAgent({
  name: "retry-agent",
  model: "gpt-4o-mini",
  systemMessage: "You are a concise assistant.",
});

async function main() {
  const response = await agent.run([new UserMessage("Say hello in one sentence.")]);
  console.log(response.messages.at(-1)?.content);
}

main().catch(console.error);
  2. Add a generic retry helper with exponential backoff and jitter. This is the part that makes the rest of your code production-friendly, because you can reuse it for model calls, tool calls, and even downstream HTTP requests.
type RetryOptions = {
  retries: number;
  baseDelayMs: number;
  maxDelayMs: number;
};

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

function isRetryableError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return /rate limit|timeout|temporarily unavailable|429|503/i.test(message);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let lastError: unknown;

  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === options.retries || !isRetryableError(error)) throw error;

      const delay = Math.min(
        options.maxDelayMs,
        options.baseDelayMs * 2 ** attempt + Math.floor(Math.random() * 100),
      );
      await sleep(delay);
    }
  }

  throw lastError;
}
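To see what the helper's backoff formula produces, the sketch below pulls the delay calculation into its own function and prints the schedule. It is a minimal illustration, not part of the helper: `backoffDelay`, `delaySchedule`, and the injectable `random` parameter are hypothetical names, added so the jitter can be pinned for inspection.

```typescript
type RetryOptions = { retries: number; baseDelayMs: number; maxDelayMs: number };

// Same formula as withRetry. The `random` parameter is a hypothetical addition
// that lets the jitter be fixed when inspecting or testing the schedule.
function backoffDelay(
  attempt: number,
  options: RetryOptions,
  random: () => number = Math.random,
): number {
  const jitter = Math.floor(random() * 100); // 0..99 ms
  return Math.min(options.maxDelayMs, options.baseDelayMs * 2 ** attempt + jitter);
}

// One delay per retry: the final attempt throws instead of sleeping.
function delaySchedule(options: RetryOptions, random?: () => number): number[] {
  return Array.from({ length: options.retries }, (_, attempt) =>
    backoffDelay(attempt, options, random),
  );
}

const schedule = delaySchedule({ retries: 4, baseDelayMs: 500, maxDelayMs: 8000 }, () => 0);
console.log(schedule); // [ 500, 1000, 2000, 4000 ] with jitter pinned to 0
```

Because the cap is applied after the jitter is added, every delay that reaches maxDelayMs is exactly maxDelayMs, so the jitter stops spreading clients out at that point. Applying the jitter after the cap is one way to avoid that if it matters for your workload.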
  3. Wrap the AutoGen call instead of wrapping the whole application. That keeps retries scoped to the failing operation and avoids replaying unrelated side effects like database writes or event publication.
import "dotenv/config";
import { AssistantAgent, UserMessage } from "@autogenai/autogen";

const agent = new AssistantAgent({
  name: "retry-agent",
  model: "gpt-4o-mini",
});

async function runWithRetry(prompt: string) {
  return withRetry(
    async () => {
      const result = await agent.run([new UserMessage(prompt)]);
      return result.messages.at(-1)?.content ?? "";
    },
    { retries: 4, baseDelayMs: 500, maxDelayMs: 8000 },
  );
}

runWithRetry("Summarize retry logic in one sentence.")
  .then(console.log)
  .catch((error) => {
    console.error("Final failure:", error);
    process.exitCode = 1;
  });
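The scoping point can be made concrete with a small, self-contained sketch. Everything here is a stand-in: `retry` is a simplified substitute for withRetry (no backoff, no error filtering), `flakyModelCall` simulates a model call that fails twice, and `auditLog` plays the role of a database write or published event.

```typescript
// Simplified stand-in for the tutorial's withRetry, so the sketch runs on its own.
async function retry<T>(fn: () => Promise<T>, retries: number): Promise<T> {
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await fn();
    } catch {
      // Swallow and try again; the final attempt below is allowed to throw.
    }
  }
  return fn();
}

const auditLog: string[] = []; // hypothetical side-effect sink
let modelCalls = 0;

// Hypothetical flaky model call: fails twice, then succeeds.
async function flakyModelCall(prompt: string): Promise<string> {
  modelCalls += 1;
  if (modelCalls < 3) throw new Error("503 temporarily unavailable");
  return `summary of: ${prompt}`;
}

async function handleRequest(prompt: string): Promise<string> {
  auditLog.push(`received: ${prompt}`); // outside the retry scope: runs once
  const summary = await retry(() => flakyModelCall(prompt), 4); // only this step is replayed
  auditLog.push(`completed: ${prompt}`); // outside the retry scope: runs once
  return summary;
}
```

Calling handleRequest once results in three model calls but only two audit entries. Had the retry wrapped handleRequest as a whole, the "received" entry would have been written on every attempt.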
  4. Make retries safe for tool-heavy agents by separating pure generation from side effects. If your agent triggers tools that mutate state, retry only the model step or make the tool idempotent with request IDs.
type OrderResult = { orderId: string; status: string };

async function createOrderOnce(orderId: string): Promise<OrderResult> {
  return withRetry(
    async () => {
      const response = await agent.run([
        new UserMessage(`Return JSON only for order ${orderId}: {"orderId":"...","status":"created"}`),
      ]);

      const content = response.messages.at(-1)?.content ?? "{}";
      return JSON.parse(content) as OrderResult;
    },
    { retries: 3, baseDelayMs: 300, maxDelayMs: 5000 },
  );
}
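The request-ID approach mentioned in this step can be sketched as follows. This is an illustration under stated assumptions: `createOrderTool` is a hypothetical tool function, and the in-memory Map stands in for whatever durable store you would use to remember completed requests.

```typescript
type OrderResult = { orderId: string; status: string };

// Hypothetical in-memory store standing in for a table keyed by request ID.
const completedRequests = new Map<string, OrderResult>();
let writes = 0; // counts real side effects, for demonstration only

async function createOrderTool(requestId: string, orderId: string): Promise<OrderResult> {
  const existing = completedRequests.get(requestId);
  if (existing) return existing; // replayed request: return the stored result, skip the write

  writes += 1; // the actual side effect (for example, an insert) would happen here
  const result: OrderResult = { orderId, status: "created" };
  completedRequests.set(requestId, result);
  return result;
}

async function demo(): Promise<number> {
  await createOrderTool("req-1", "A100");
  await createOrderTool("req-1", "A100"); // simulated retry with the same request ID
  return writes;
}

demo().then((count) => console.log(`writes: ${count}`)); // writes: 1
```

The request ID has to be generated once, outside the retried function, so that every attempt reuses it. With a real database, the check and the write also need to be atomic (for example, a unique constraint on the request ID); the Map version is only safe here because nothing awaits between the lookup and the set.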
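  5. If you need observability, log attempt counts explicitly. In production, this is what lets you distinguish a transient provider issue from a broken prompt or malformed tool output. Because withRetry does not expose its computed delays, this wrapper can only count attempts; logging delay values requires adding a callback inside the helper itself.
async function withRetryLogged<T>(
  name: string,
  fn: () => Promise<T>,
): Promise<T> {
  let attempts = 0;
  return withRetry(
    async () => {
      attempts += 1;
      console.info(`[${name}] attempt ${attempts}`);
      return fn();
    },
    { retries: 4, baseDelayMs: 400, maxDelayMs: 6000 },
  ).catch((error) => {
    // Reached for both exhausted retries and non-retryable errors.
    console.error(`[${name}] failed after ${attempts} attempt(s)`, error);
    throw error;
  });
}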
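The wrapper above cannot see the delays that withRetry computes. One way to log them, sketched here on the assumption that you are free to change the helper, is an optional callback invoked just before each sleep. `withRetryObserved` and `onRetry` are hypothetical names and not part of the step 2 helper.

```typescript
type RetryOptions = {
  retries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  // Hypothetical hook: called before each sleep with the retry number and delay.
  onRetry?: (info: { attempt: number; delayMs: number; error: unknown }) => void;
};

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

function isRetryableError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return /rate limit|timeout|temporarily unavailable|429|503/i.test(message);
}

async function withRetryObserved<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === options.retries || !isRetryableError(error)) throw error;
      const delayMs = Math.min(
        options.maxDelayMs,
        options.baseDelayMs * 2 ** attempt + Math.floor(Math.random() * 100),
      );
      options.onRetry?.({ attempt: attempt + 1, delayMs, error });
      await sleep(delayMs);
    }
  }
  throw lastError;
}

// Usage with a stubbed operation that fails twice before succeeding.
let calls = 0;
withRetryObserved(
  async () => {
    calls += 1;
    if (calls < 3) throw new Error("429 rate limit");
    return "ok";
  },
  {
    retries: 4,
    baseDelayMs: 10,
    maxDelayMs: 50,
    onRetry: ({ attempt, delayMs }) => console.warn(`retry ${attempt} in ${delayMs}ms`),
  },
).then(console.log);
```

Keeping the hook as a plain callback means the helper stays free of any logging dependency; the caller decides whether the event goes to console, a metrics client, or a tracer.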

Testing It

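Run the script normally first and confirm you get a response on the first attempt. Then force a failure and watch how the helper reacts. Keep in mind that isRetryableError only matches messages containing "rate limit", "timeout", "temporarily unavailable", "429", or "503". An invalid model name or a dropped network connection is retried only if the resulting error message happens to match one of those patterns; otherwise the helper throws on the first attempt. To see retries reliably, make the wrapped function throw an error with a matching message, such as "timeout".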

If you want a cleaner test, stub agent.run() in a unit test so it fails twice and succeeds on the third call. Verify that your helper waits between attempts and does not retry non-retryable errors like JSON parse failures unless you explicitly want that behavior.
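A framework-free version of that check might look like the sketch below, which you could port to your test runner of choice. `makeStubRun` and `checkRetryBehavior` are hypothetical names; the stub stands in for agent.run() and returns a plain string rather than the agent's real response shape. The helper is copied from step 2 so the file runs on its own.

```typescript
type RetryOptions = { retries: number; baseDelayMs: number; maxDelayMs: number };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

function isRetryableError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return /rate limit|timeout|temporarily unavailable|429|503/i.test(message);
}

// Copy of the step 2 helper.
async function withRetry<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === options.retries || !isRetryableError(error)) throw error;
      const jitter = Math.floor(Math.random() * 100);
      await sleep(Math.min(options.maxDelayMs, options.baseDelayMs * 2 ** attempt + jitter));
    }
  }
  throw lastError;
}

// Hypothetical stub for agent.run(): rejects with each queued error in turn, then resolves.
function makeStubRun(failures: Error[], finalContent: string) {
  let calls = 0;
  const run = async (): Promise<string> => {
    const failure = failures[calls];
    calls += 1;
    if (failure) throw failure;
    return finalContent;
  };
  return { run, callCount: () => calls };
}

async function checkRetryBehavior() {
  // Tiny delays keep the check fast.
  const options: RetryOptions = { retries: 4, baseDelayMs: 1, maxDelayMs: 5 };

  // Two retryable failures, then success: expect three calls in total.
  const flaky = makeStubRun([new Error("429 rate limit"), new Error("timeout")], "hello");
  const content = await withRetry(flaky.run, options);

  // A JSON parse failure does not match the retryable pattern: expect a single call.
  const broken = makeStubRun([new SyntaxError("Unexpected token < in JSON")], "unused");
  const brokenOutcome = await withRetry(broken.run, options).then(
    () => "resolved",
    () => "rejected",
  );

  return { content, flakyCalls: flaky.callCount(), brokenOutcome, brokenCalls: broken.callCount() };
}

checkRetryBehavior().then((result) => console.log(result));
// { content: 'hello', flakyCalls: 3, brokenOutcome: 'rejected', brokenCalls: 1 }
```

This check counts calls but does not assert on the waiting time between them. Verifying delays precisely usually needs fake timers from your test framework, since real timers make such assertions flaky.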

Next Steps

  • Add per-error retry policies for rate limits, timeouts, and invalid outputs
  • Wrap retries in circuit breakers so bad dependencies stop burning tokens
  • Move from simple backoff to queue-based orchestration when multiple agents share one provider quota
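As a starting point for the circuit breaker item, here is a minimal sketch based on consecutive failures and a cooldown. `createCircuitBreaker` and its options are hypothetical names, and the design is deliberately simple; it is an illustration of the idea, not a production implementation.

```typescript
type BreakerOptions = {
  failureThreshold: number; // consecutive failures before the circuit opens
  cooldownMs: number; // how long calls are rejected once the circuit is open
  now?: () => number; // injectable clock, useful in tests
};

function createCircuitBreaker(options: BreakerOptions) {
  const now = options.now ?? Date.now;
  let consecutiveFailures = 0;
  let openedAt: number | null = null;

  async function call<T>(fn: () => Promise<T>): Promise<T> {
    // While open and still cooling down, fail fast without invoking fn.
    if (openedAt !== null && now() - openedAt < options.cooldownMs) {
      throw new Error("circuit open: call skipped");
    }
    try {
      // Either closed, or cooldown has elapsed and this is a trial call.
      const result = await fn();
      consecutiveFailures = 0;
      openedAt = null;
      return result;
    } catch (error) {
      consecutiveFailures += 1;
      if (consecutiveFailures >= options.failureThreshold) openedAt = now();
      throw error;
    }
  }

  return { call };
}

// Intended usage with the tutorial's retried call (shown as a comment because
// runWithRetry lives in the step 3 file):
//   const breaker = createCircuitBreaker({ failureThreshold: 3, cooldownMs: 30_000 });
//   const text = await breaker.call(() => runWithRetry(prompt));
```

Placing the breaker outside the retried call means one exhausted retry sequence counts as a single failure. The "circuit open" message also does not match the isRetryableError pattern, so if the order were reversed the retry helper would not keep hammering an open breaker.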

By Cyprian Aarons, AI Consultant at Topiax.
