How to Fix 'duplicate tool calls when scaling' in LangChain (TypeScript)
What the error means
"Duplicate tool calls when scaling" usually means your agent is receiving the same tool invocation more than once, or your app is dispatching the same assistant message through multiple workers. In LangChain for TypeScript, this shows up most often when you run an agent in a horizontally scaled setup, retry a request without an idempotency key, or process the same event twice from a queue.
The symptom is usually noisy logs around `AIMessage.tool_calls`, repeated `ToolMessage` execution, or downstream side effects happening twice: duplicate DB writes, duplicate API calls, duplicate tickets.
The Most Common Cause
The #1 cause is re-processing the same LLM response across multiple workers or retries without deduping by message ID / run ID.
This happens when you store only the raw messages and then let two instances continue the same conversation. Both instances see the same `AIMessage` with pending tool calls, and both execute them.
Broken vs fixed pattern
| Broken pattern | Fixed pattern |
|---|---|
| Multiple workers consume the same job and both call `agentExecutor.invoke()` | One worker owns a run, or you dedupe by `runId` / message ID before executing tools |
| No idempotency key on side-effecting tools | Tool execution checks a stable request ID before doing work |
```typescript
// ❌ Broken: same payload can be processed twice by two workers
import { AgentExecutor, createOpenAIFunctionsAgent } from "langchain/agents";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

const agent = await createOpenAIFunctionsAgent({
  llm,
  tools, // your tool definitions
  prompt, // your agent prompt
});
const executor = new AgentExecutor({ agent, tools });

queue.consume(async (job) => {
  // If this job is redelivered or picked up by another worker,
  // tool calls can execute twice.
  const result = await executor.invoke({
    input: job.data.input,
    chat_history: job.data.chat_history,
  });
  await saveResult(result);
});
```
```typescript
// ✅ Fixed: dedupe by run/job id and make tools idempotent
// An in-memory Set is shown for brevity; use a shared store (Redis, DB)
// once you run more than one process.
const processedRuns = new Set<string>();

queue.consume(async (job) => {
  const runId = job.data.runId ?? job.id;
  if (processedRuns.has(runId)) return;
  processedRuns.add(runId);

  const result = await executor.invoke(
    {
      input: job.data.input,
      chat_history: job.data.chat_history,
    },
    {
      // Thread a stable run id through config so downstream code can dedupe
      configurable: { runId },
    }
  );
  await saveResult(result);
});
```
```typescript
// Example idempotent tool
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const createTicketTool = tool(
  async ({ externalRequestId, title }: { externalRequestId: string; title: string }) => {
    // A second call with the same externalRequestId returns the existing
    // ticket instead of creating a duplicate.
    const existing = await db.ticket.findUnique({ where: { externalRequestId } });
    if (existing) return existing;
    return db.ticket.create({
      data: { externalRequestId, title },
    });
  },
  {
    name: "create_ticket",
    description: "Create a support ticket once",
    schema: z.object({
      externalRequestId: z.string(),
      title: z.string(),
    }),
  }
);
```
If you’re using the `ToolMessage` flow directly, the same rule applies: one logical tool call gets exactly one durable execution record. Don’t rely on in-memory state once you scale beyond one process.
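That durable record can be as small as an atomic "claim this run id" check. Below is a minimal sketch, assuming a key-value store with an atomic set-if-absent operation (Redis `SET NX`, a database unique index, etc.); the `InMemoryDedupeStore` and `shouldExecuteRun` names are illustrative stand-ins, not LangChain APIs:

```typescript
// Sketch: durable dedupe behind a store interface. InMemoryDedupeStore is a
// stand-in; in production, back this with Redis SET NX or a DB unique index.
interface DedupeStore {
  // Returns true if the key was newly claimed, false if already seen.
  claim(key: string): boolean;
}

class InMemoryDedupeStore implements DedupeStore {
  private seen = new Set<string>();
  claim(key: string): boolean {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

function shouldExecuteRun(store: DedupeStore, runId: string): boolean {
  // Only the first worker to claim a run id executes its tool calls.
  return store.claim(`run:${runId}`);
}
```

The interface keeps the dedupe decision in one place, so swapping the in-memory demo for a shared store later doesn't touch the worker code.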
Other Possible Causes
1) Retry middleware replays the same assistant turn
If you wrap model calls with retries at the wrong layer, the retry may re-run the entire agent step and emit the same `tool_calls` again.
```typescript
// Problematic if it retries after tool selection but before persistence
const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  maxRetries: 3,
});
```
Fix by retrying only transient network failures, not completed agent turns. Keep retries below the orchestration layer if possible.
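One way to keep retries below the orchestration layer is a small wrapper around the raw network call that retries only transient failures. A sketch, assuming errors carry an HTTP-style `status` field (an assumption; adapt `isTransient` to your client's actual error shape):

```typescript
// Sketch: retry only transient failures (429s, 5xx) at the network layer,
// never a completed agent turn. The error shape here is an assumption.
function isTransient(err: unknown): boolean {
  const status = (err as { status?: number }).status;
  return status === 429 || (typeof status === "number" && status >= 500);
}

async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Permanent errors and exhausted budgets propagate immediately.
      if (!isTransient(err) || attempt === maxAttempts) throw err;
    }
  }
  throw lastErr; // unreachable; satisfies the type checker
}
```

Wrap only the raw request with `withRetry`, not the agent loop, so a retry can never replay an already-selected set of tool calls.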
2) Parallel branches invoke the same tool
If you fan out work with Promise.all() and each branch can trigger a tool call, you can easily duplicate side effects.
```typescript
// ❌ Both branches may call the same write tool
await Promise.all([
  executor.invoke({ input }),
  executor.invoke({ input }),
]);
```
Use one orchestrator per conversation, or gate write tools behind a single queue consumer.
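A lightweight way to gate this without a full queue is to serialize invocations per conversation. A minimal sketch, assuming each request carries a stable conversation id; `runSerialized` is a hypothetical helper that chains tasks for the same id so they run one at a time even when callers fire them concurrently:

```typescript
// Sketch: per-conversation serializer. Each conversation id maps to a
// promise chain, so concurrent calls for the same conversation run in order.
const chains = new Map<string, Promise<unknown>>();

function runSerialized<T>(conversationId: string, task: () => Promise<T>): Promise<T> {
  const prev = chains.get(conversationId) ?? Promise.resolve();
  // Chain the new task after the previous one, whether it succeeded or failed.
  const next = prev.then(task, task);
  // Swallow rejections in the stored chain so one failure doesn't poison it;
  // the caller still sees the rejection via the returned promise.
  chains.set(conversationId, next.catch(() => undefined));
  return next;
}
```

Calls for different conversation ids still run in parallel; only same-conversation work is forced into a single file line.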
3) Memory is shared across requests
A shared `BufferMemory` instance or a reused message array can cause one request to inherit another request’s pending tool calls.
```typescript
// ❌ Shared mutable state across users/requests
const memory = new BufferMemory();
const executor = new AgentExecutor({ agent, tools, memory });

app.post("/chat", async (req, res) => {
  // Every user's turn lands in the same memory instance.
  const result = await executor.invoke({ input: req.body.input });
  res.json(result);
});
```
Use per-session storage keyed by user/session ID, and never reuse mutable message arrays between concurrent requests.
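A minimal sketch of per-session state, assuming sessions are identified by a `sessionId` string (the `getHistory` helper and `StoredMessage` shape are illustrative, not LangChain APIs); each session gets its own history array, so concurrent requests never mutate each other's messages:

```typescript
// Sketch: per-session chat history keyed by session id. Nothing mutable is
// shared between sessions. In production, back this with Redis or a DB so
// it survives restarts and works across processes.
type StoredMessage = { role: "human" | "ai"; content: string };

const histories = new Map<string, StoredMessage[]>();

function getHistory(sessionId: string): StoredMessage[] {
  let history = histories.get(sessionId);
  if (!history) {
    history = [];
    histories.set(sessionId, history);
  }
  return history;
}
```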
4) Tool execution is not idempotent
Even if LangChain behaves correctly, your side effect may not. A second identical `create_order` call should no-op when it sees the same business key.
```typescript
// ❌ No guard against duplicate side effects
async function chargeCard({ orderId }: { orderId: string }) {
  return payments.charge({ orderId });
}
```
Add an idempotency key at the API boundary:
```typescript
// ✅ The payment provider can drop the second identical request
async function chargeCard({ orderId, idempotencyKey }: { orderId: string; idempotencyKey: string }) {
  return payments.charge({ orderId, idempotencyKey });
}
```
How to Debug It
- Log every run with a stable identifier
  - Print `runId`, session ID, queue job ID, and conversation ID.
  - If two executions share the same IDs, you have replay or redelivery.
- Inspect emitted tool calls
  - Log `AIMessage.tool_calls`, tool name, args, and message ID.
  - If the exact same `tool_call_id` appears twice, your orchestration is duplicating execution.
- Check worker topology
  - Confirm whether more than one consumer can process the same queue item.
  - Look for at-least-once delivery behavior in BullMQ, SQS, RabbitMQ, Kafka consumers, or serverless retries.
- Disable retries and parallelism temporarily
  - Turn off model retries.
  - Replace `Promise.all()` with sequential execution.
  - If duplicates disappear, the bug is in orchestration rather than LangChain itself.
Prevention
- Make every side-effecting tool idempotent using a business key or idempotency key.
- Store conversation state per session/run; never share mutable memory across requests.
- Treat agent execution as at-least-once by default and design for deduplication at the application layer.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit