LangGraph Tutorial (TypeScript): implementing retry logic for intermediate developers

By Cyprian Aarons · Updated 2026-04-22

This tutorial shows you how to add retry logic to a LangGraph workflow in TypeScript so failed nodes can be retried without crashing the whole run. You need this when you call flaky external APIs, rate-limited services, or anything that can fail transiently and should be retried before you give up.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • @langchain/langgraph
  • @langchain/core
  • An API key only if your graph calls an external model or service
  • A project with "type": "module" in package.json or a compatible TS runtime setup

Install the packages:

npm install @langchain/langgraph @langchain/core
npm install -D typescript tsx @types/node

Step-by-Step

  1. Start with a small graph that has one unreliable node. The retry logic will be easier to reason about if you isolate the failure point instead of wrapping the whole graph.
import { StateGraph, Annotation } from "@langchain/langgraph";

const GraphState = Annotation.Root({
  attempts: Annotation<number>({
    reducer: (current, update) => update ?? current,
    default: () => 0,
  }),
  result: Annotation<string | null>({
    reducer: (current, update) => update ?? current,
    default: () => null,
  }),
});

type GraphStateType = typeof GraphState.State;

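// Counts calls across retries. A retry wrapper re-invokes the node with the
// same state snapshot, so state.attempts alone cannot track the attempt number.
let callCount = 0;

const flakyNode = async (_state: GraphStateType) => {
  callCount += 1;

  // Fail the first two calls to simulate a transient error.
  if (callCount < 3) {
    throw new Error(`Transient failure on attempt ${callCount}`);
  }

  return {
    attempts: callCount,
    result: `Succeeded on attempt ${callCount}`,
  };
};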
  2. Add a retry wrapper around the node. This is the core pattern: catch the error, wait briefly (the delay grows linearly with each attempt), and try again, rethrowing the last error once the retries run out.
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

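async function withRetry<T>(
  fn: () => Promise<T>,
  options: { retries: number; delayMs: number }
): Promise<T> {
  let lastError: unknown;

  // `retries` counts re-attempts, so fn runs at most retries + 1 times.
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Back off only if another attempt remains: delayMs, 2 * delayMs, ...
      if (attempt < options.retries) {
        await sleep(options.delayMs * (attempt + 1));
      }
    }
  }

  // Every attempt failed; surface the most recent error.
  throw lastError;
}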
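
One detail worth noting about a wrapper like this: every retry calls fn again with whatever values it closed over, such as the same state snapshot, so the wrapped function cannot tell which attempt it is on unless it is told. Below is a minimal sketch of a variation that passes the 1-based attempt number into the callback. The names withRetryAttempt and pause are illustrative assumptions, not part of LangGraph.

```typescript
// Hypothetical variation of a retry wrapper: the callback receives the
// 1-based attempt number, which helps with logging or attempt-aware logic.
const pause = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function withRetryAttempt<T>(
  fn: (attempt: number) => Promise<T>,
  options: { retries: number; delayMs: number }
): Promise<T> {
  let lastError: unknown;

  // retries + 1 total attempts, numbered from 1.
  for (let attempt = 1; attempt <= options.retries + 1; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      // Wait only if another attempt remains; the delay grows linearly.
      if (attempt <= options.retries) {
        await pause(options.delayMs * attempt);
      }
    }
  }

  throw lastError;
}

// Usage: the callback fails on attempts 1 and 2 and succeeds on attempt 3.
const seenAttempts: number[] = [];
const demo = withRetryAttempt(
  async (attempt) => {
    seenAttempts.push(attempt);
    if (attempt < 3) throw new Error(`Transient failure on attempt ${attempt}`);
    return `Succeeded on attempt ${attempt}`;
  },
  { retries: 2, delayMs: 10 }
);
```

Passing the attempt number keeps the counter out of module-level state, at the cost of a slightly wider callback signature.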
  3. Build the graph using the wrapped node. The graph itself stays simple; retry behavior lives inside the node where failures happen.
const workflow = new StateGraph(GraphState)
  .addNode("flaky", async (state) =>
    withRetry(() => flakyNode(state), { retries: 2, delayMs: 250 })
  )
  .addEdge("__start__", "flaky")
  .addEdge("flaky", "__end__")
  .compile();
  4. Run it and inspect the final state. Because the node fails twice and succeeds on the third attempt, you should see a successful result instead of an exception.
async function main() {
  const output = await workflow.invoke({
    attempts: 0,
    result: null,
  });

  console.log(output);
}

main().catch((error) => {
  console.error("Workflow failed:", error);
});
  5. If you want more control, make retries conditional. In production, you usually retry only transient failures like timeouts, HTTP 429s, or upstream connection resets.
function isRetryable(error: unknown): boolean {
  if (!(error instanceof Error)) return false;

  return (
    error.message.includes("Transient") ||
    error.message.includes("timeout") ||
    error.message.includes("429")
  );
}

async function withConditionalRetry<T>(
  fn: () => Promise<T>,
  options: { retries: number; delayMs: number }
): Promise<T> {
  let lastError: unknown;

  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (!isRetryable(error) || attempt === options.retries) break;
      await sleep(options.delayMs * (attempt + 1));
    }
  }

  throw lastError;
}
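
Matching on message text is brittle, because messages vary between libraries and versions. Where the thrown error carries structured fields, a classifier along these lines may be more robust. This is only a sketch: the status and code property names, and the specific values treated as retryable, are assumptions about the error shape, so check what your HTTP client or SDK actually throws.

```typescript
// Hypothetical shape: many HTTP clients attach a numeric status, and Node.js
// network errors carry a string code. Neither is guaranteed for every library.
interface MaybeStructuredError {
  status?: unknown;
  code?: unknown;
}

// HTTP statuses that commonly signal a temporary condition (assumed list).
const RETRYABLE_STATUS = new Set<number>([408, 429, 502, 503, 504]);

// Node.js system error codes for timeouts, resets, and DNS hiccups (assumed list).
const RETRYABLE_CODES = new Set<string>(["ETIMEDOUT", "ECONNRESET", "EAI_AGAIN"]);

function isRetryableError(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;

  const { status, code } = error as MaybeStructuredError;

  if (typeof status === "number" && RETRYABLE_STATUS.has(status)) return true;
  if (typeof code === "string" && RETRYABLE_CODES.has(code)) return true;

  // Anything unrecognized is treated as a real failure and not retried.
  return false;
}

// Usage with hand-built errors standing in for real client failures.
const rateLimited = Object.assign(new Error("Too Many Requests"), { status: 429 });
const connectionReset = Object.assign(new Error("socket hang up"), { code: "ECONNRESET" });
const badRequest = Object.assign(new Error("Bad Request"), { status: 400 });

console.log(isRetryableError(rateLimited)); // true
console.log(isRetryableError(connectionReset)); // true
console.log(isRetryableError(badRequest)); // false
```

A function like this could stand in for isRetryable inside withConditionalRetry; defaulting to "not retryable" keeps unknown errors visible instead of silently retried.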

Testing It

Run the file with tsx and confirm that the graph completes successfully after two retries. You should see attempts equal to 3 and result set to "Succeeded on attempt 3".

Then change retries to 1 and rerun it. The workflow should fail because the node needs three attempts to succeed.

Finally, switch the node to withConditionalRetry, replace the thrown error message with something non-retryable (any text without "Transient", "timeout", or "429"), and verify that the helper stops after the first failure. That check matters because you do not want to hide real bugs behind repeated retries.
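
The three manual checks above could also be automated. The sketch below is one way to do that with Node's built-in assert module. It restates a compact retry helper (without delays) so that it runs standalone; makeFlaky, retry, transientOnly, and runChecks are hypothetical names introduced for illustration.

```typescript
import { strictEqual, rejects } from "node:assert";

// Hypothetical helper: returns a function that throws `message` on its first
// `failures` calls and then resolves with the number of calls made so far.
function makeFlaky(failures: number, message: string) {
  let calls = 0;
  const fn = async (): Promise<number> => {
    calls += 1;
    if (calls <= failures) throw new Error(message);
    return calls;
  };
  return { fn, callCount: () => calls };
}

// Compact stand-in for a conditional retry helper (no delay, to keep checks fast).
async function retry<T>(
  fn: () => Promise<T>,
  retries: number,
  isRetryable: (error: unknown) => boolean
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up when retries are exhausted or the error is not retryable.
      if (attempt >= retries || !isRetryable(error)) throw error;
    }
  }
}

const transientOnly = (error: unknown): boolean =>
  error instanceof Error && error.message.includes("Transient");

async function runChecks(): Promise<void> {
  // 1. Two failures, two retries: succeeds on the third call.
  const ok = makeFlaky(2, "Transient failure");
  strictEqual(await retry(ok.fn, 2, transientOnly), 3);

  // 2. Two failures, one retry: the second error propagates.
  const tooFew = makeFlaky(2, "Transient failure");
  await rejects(retry(tooFew.fn, 1, transientOnly), /Transient failure/);
  strictEqual(tooFew.callCount(), 2);

  // 3. Non-retryable error: no retry is attempted.
  const fatal = makeFlaky(2, "Invalid input");
  await rejects(retry(fatal.fn, 2, transientOnly), /Invalid input/);
  strictEqual(fatal.callCount(), 1);
}

runChecks().then(() => console.log("all retry checks passed"));
```

Dropping the delay in the stand-in helper keeps the checks fast; a fuller test would inject the sleep function so real backoff logic can be exercised without waiting.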

Next Steps

  • Add exponential backoff with jitter instead of the linearly increasing delay used here.
  • Move retry policy into a reusable helper for all tool-calling nodes.
  • Combine retries with circuit breaking so bad dependencies stop burning compute.
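
As a starting point for the first item, here is a minimal sketch of exponential backoff with "full jitter", where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function name backoffDelayMs, its option names, and the example values are illustrative assumptions rather than anything LangGraph provides.

```typescript
interface BackoffOptions {
  baseMs: number; // delay ceiling for the first retry
  maxMs: number; // upper bound on any single delay
  factor?: number; // growth per retry; defaults to 2
}

// Returns the delay to wait before retry number `attempt` (0-based).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  { baseMs, maxMs, factor = 2 }: BackoffOptions,
  random: () => number = Math.random
): number {
  // Exponential ceiling: baseMs, baseMs * factor, baseMs * factor^2, ... capped at maxMs.
  const ceiling = Math.min(maxMs, baseMs * Math.pow(factor, attempt));
  // Full jitter: pick uniformly in [0, ceiling) so concurrent callers spread out.
  return Math.floor(random() * ceiling);
}

// Usage: pinning random() at 0.5 yields half of each ceiling.
const backoff: BackoffOptions = { baseMs: 250, maxMs: 2000 };
const halves = [0, 1, 2, 3, 4].map((attempt) =>
  backoffDelayMs(attempt, backoff, () => 0.5)
);
console.log(halves); // [125, 250, 500, 1000, 1000]
```

In the helpers from the steps above, a call like this would replace the options.delayMs * (attempt + 1) expression passed to sleep.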

By Cyprian Aarons, AI Consultant at Topiax.
