How to Fix 'duplicate tool calls when scaling' in LangGraph (TypeScript)

By Cyprian Aarons · Updated 2026-04-21

When LangGraph starts throwing duplicate tool calls when scaling, it usually means the same tool invocation is being executed more than once across parallel workers, retries, or replays. In practice, this shows up when you move from a single-process dev setup to multiple Node processes, serverless instances, or a graph that can resume from persisted state.

The failure is almost always about non-idempotent tool execution plus state being replayed without a dedupe key. If your tool writes to a database, sends a webhook, or mutates external state, duplicate execution becomes visible fast.

The Most Common Cause

The #1 cause is re-running the same ToolNode work after a retry or checkpoint restore without guarding against already-processed tool_call.id values.

In LangGraph JS/TS, the model emits AIMessage.tool_calls. If your graph resumes and the same assistant message is processed again, your tool runs again unless you explicitly dedupe by tool_call.id.

Broken pattern                           | Fixed pattern
Executes tools every time the node runs  | Tracks processed tool call IDs in state
No idempotency guard                     | Skips already-seen calls
Safe in local dev, breaks under scaling  | Stable under retries and replay

// BROKEN: duplicate tool execution on replay/resume
import { ToolNode } from "@langchain/langgraph/prebuilt";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const sendEmailTool = tool(
  async ({ to, subject }: { to: string; subject: string }) => {
    // Non-idempotent side effect: every execution sends another email.
    await fetch("https://api.mailer.local/send", {
      method: "POST",
      body: JSON.stringify({ to, subject }),
    });
    return "sent";
  },
  {
    name: "send_email",
    description: "Send an email",
    schema: z.object({
      to: z.string(),
      subject: z.string(),
    }),
  }
);

const tools = new ToolNode([sendEmailTool]);

// If the same AIMessage is replayed after a retry/checkpoint restore,
// ToolNode executes send_email again.

// FIXED: dedupe by tool_call.id in state before executing side effects
import { AIMessage, ToolMessage } from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

type GraphState = {
  messages: Array<AIMessage | ToolMessage>;
  seenToolCallIds: string[];
};

const sendEmailTool = tool(
  async ({ to, subject }: { to: string; subject: string }) => {
    await fetch("https://api.mailer.local/send", {
      method: "POST",
      body: JSON.stringify({ to, subject }),
    });
    return "sent";
  },
  {
    name: "send_email",
    description: "Send an email",
    schema: z.object({
      to: z.string(),
      subject: z.string(),
    }),
  }
);

async function executeTools(state: GraphState): Promise<Partial<GraphState>> {
  const last = state.messages[state.messages.length - 1];
  if (!(last instanceof AIMessage)) return {};

  const newMessages: ToolMessage[] = [];
  const seen = new Set(state.seenToolCallIds);

  for (const call of last.tool_calls ?? []) {
    // Skip calls without an id and calls that have already been executed.
    if (!call.id || seen.has(call.id)) continue;

    seen.add(call.id);
    const result = await sendEmailTool.invoke(
      call.args as { to: string; subject: string }
    );

    newMessages.push(
      new ToolMessage({
        content: result,
        tool_call_id: call.id,
      })
    );
  }

  return {
    messages: [...state.messages, ...newMessages],
    seenToolCallIds: [...seen],
  };
}

This matters because LangGraph will happily replay deterministic state transitions. Your external side effect is not deterministic unless you make it so.
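If two branches of the graph run in parallel, both can write to seenToolCallIds, so the merge has to be a set union rather than a plain overwrite. In LangGraph JS you would register this via a channel reducer (e.g. through Annotation.Root), but the merge logic itself is plain TypeScript. A minimal sketch, where mergeSeenIds is a hypothetical name:

```typescript
// Hypothetical reducer for the seenToolCallIds channel: union of both
// writers' ID lists, with duplicates collapsed and first-seen order kept.
function mergeSeenIds(current: string[], incoming: string[]): string[] {
  return [...new Set([...current, ...incoming])];
}
```

The Set preserves insertion order, so IDs already in state keep their position and only genuinely new IDs are appended.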

Other Possible Causes

1) Multiple workers processing the same checkpoint

If you run several Node processes against the same persistence layer and don’t enforce a single active consumer per thread/run, two workers can pick up the same task.

// Example symptom:
// both workers load thread_id="abc123" and execute the same pending AIMessage

const config = {
  configurable: {
    thread_id: "abc123",
  },
};

Fix it with one of these:

  • A queue with exclusive leasing
  • Per-thread locking in Redis/Postgres
  • Exactly one worker per thread_id
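The leasing idea can be sketched in a few lines. This is an in-memory stand-in (the Map plays the role that Redis SET NX PX or a Postgres advisory lock would play in production; ThreadLeases is a hypothetical name):

```typescript
// Minimal sketch of per-thread leasing: a worker may only process a
// thread if it holds an unexpired lease on that thread_id.
class ThreadLeases {
  private leases = new Map<string, number>(); // thread_id -> expiry (ms epoch)

  acquire(threadId: string, ttlMs: number, now = Date.now()): boolean {
    const expiry = this.leases.get(threadId);
    if (expiry !== undefined && expiry > now) return false; // another worker holds it
    this.leases.set(threadId, now + ttlMs);
    return true;
  }

  release(threadId: string): void {
    this.leases.delete(threadId);
  }
}
```

The TTL matters: if a worker crashes mid-run, its lease expires and another worker can take over instead of the thread being stuck forever.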

2) Retrying the whole graph instead of retrying only the failed step

A top-level retry wrapper can re-run the model node and then the tool node again.

// BROKEN
for (let attempt = 0; attempt < 3; attempt++) {
  try {
    await graph.invoke(input); // full replay: model node AND tool node run again
    break;
  } catch {
    // retries the whole graph, not just the failed step
  }
}

Prefer step-level retries or idempotent tools.

// Better
await graph.invoke(input, {
  configurable: { thread_id },
});

Then let LangGraph resume from checkpointed state instead of re-invoking everything manually.
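If you do need a retry wrapper, scope it to the individual failing unit of work rather than graph.invoke. A generic helper you could wrap around a single node's side effect (retryStep is a hypothetical name, not a LangGraph API):

```typescript
// Sketch of step-level retry: only the failing step re-runs, so nodes
// that already succeeded are never replayed.
async function retryStep<T>(step: () => Promise<T>, attempts = 3): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await step();
    } catch (err) {
      lastErr = err; // remember the failure and try this step again
    }
  }
  throw lastErr;
}
```

Combined with checkpointing, this keeps the blast radius of a transient failure to one step instead of the whole run.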

3) Non-idempotent external tools

If your tool sends payments, creates tickets, or posts Slack messages, duplicate calls are visible even if LangGraph is behaving correctly.

async function createTicket({ title }: { title: string }) {
  // BROKEN if called twice
  return fetch("/tickets", {
    method: "POST",
    body: JSON.stringify({ title }),
  });
}

Add an idempotency key:

async function createTicket({ title }: { title: string }, key: string) {
  return fetch("/tickets", {
    method: "POST",
    headers: { "Idempotency-Key": key },
    body: JSON.stringify({ title }),
  });
}

Use tool_call.id as that key whenever possible.
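Wiring that together: in your tool-execution loop, thread each call's id through as the idempotency key. A sketch (ToolCall and runCreateTicketCalls are illustrative names; createTicket is the keyed version from above):

```typescript
// Each tool call carries its own id, which doubles as the idempotency key,
// so a replayed call.id collapses to a no-op on the server side.
type ToolCall = { id: string; name: string; args: { title: string } };

async function runCreateTicketCalls(
  calls: ToolCall[],
  createTicket: (args: { title: string }, key: string) => Promise<string>
): Promise<string[]> {
  const results: string[] = [];
  for (const call of calls) {
    results.push(await createTicket(call.args, call.id));
  }
  return results;
}
```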

4) Reconstructing messages incorrectly after persistence

A common bug is appending old messages twice when hydrating from storage.

// BROKEN
const saved = await loadThread(threadId);
const messages = [...saved.messages, ...incoming.messages];

If incoming.messages already includes persisted history, you just doubled it. That leads to repeated tool calls on the next model turn.
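The fix is the same dedupe idea applied at hydration time: merge by message id instead of blindly concatenating. A minimal sketch with a simplified message shape (Msg and mergeHistory are illustrative names):

```typescript
// Merge persisted history with incoming messages, keeping each id once.
// Incoming messages that duplicate persisted history are dropped.
type Msg = { id: string; content: string };

function mergeHistory(saved: Msg[], incoming: Msg[]): Msg[] {
  const seen = new Set(saved.map((m) => m.id));
  return [...saved, ...incoming.filter((m) => !seen.has(m.id))];
}
```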

How to Debug It

  1. Log every tool_call.id before execution

    • If you see the same ID twice, this is a replay/dedupe problem.
    • If you see different IDs with identical arguments, your prompt/model is generating duplicates.
  2. Check whether duplicates happen only after restart or scale-out

    • If yes, suspect checkpoint restore or multiple workers.
    • If no, inspect your agent loop for accidental double invocation.
  3. Inspect your persistence layer

    • Verify one thread/run maps to one active consumer.
    • In Postgres/Redis setups, look for concurrent readers on the same checkpoint row/key.
  4. Temporarily make the tool side-effect free

    • Replace real network writes with console logs.
    • If duplicates still appear in logs but not in downstream systems, your issue is execution flow.
    • If duplicates disappear once side effects are removed, you need idempotency keys.
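Step 1 is easy to instrument with a small wrapper around your tool loop. A sketch (makeToolCallLogger is a hypothetical helper, not a LangGraph API):

```typescript
// Debugging aid: record every tool_call.id before execution and warn
// loudly the moment the same id shows up a second time.
function makeToolCallLogger() {
  const seen = new Map<string, number>();
  return function logToolCall(id: string, name: string): boolean {
    const count = (seen.get(id) ?? 0) + 1;
    seen.set(id, count);
    if (count > 1) {
      console.warn(`DUPLICATE tool_call.id ${id} (${name}) seen ${count}x`);
      return true; // duplicate: replay/dedupe problem
    }
    console.log(`executing tool_call.id ${id} (${name})`);
    return false;
  };
}
```

In a multi-worker setup, make sure the Map is replaced by shared storage, or each worker will only see its own calls and duplicates across workers will go unflagged.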

Prevention

  • Make every external tool idempotent.
  • Store processed tool_call.id values in graph state or durable storage.
  • Use checkpoints plus single-consumer execution per thread/run.
  • Never wrap the whole graph in blind retries if tools mutate external systems.

If you’re seeing this error specifically under scale and not locally, treat it as a distributed systems bug first and a LangGraph bug second. In most cases the fix is boring but effective: dedupe by tool_call.id, lock per thread, and make side effects idempotent.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
