How to Fix 'duplicate tool calls' in LangChain (Python)
What this error means
duplicate tool calls in LangChain usually means the model produced the same tool invocation more than once, and your agent/runtime tried to execute or register it twice. You’ll typically see it when using AgentExecutor, create_tool_calling_agent, or a chat model with tool calling enabled and some retry/streaming logic layered on top.
In practice, this shows up when you reuse message history incorrectly, append AI messages manually, or let both LangChain and your own code drive tool execution.
The Most Common Cause
The #1 cause is replaying the same assistant/tool messages back into the next agent run.
This happens a lot when people store conversation state in a list, then feed the entire list back into AgentExecutor.invoke() without separating:
- user input
- assistant tool-call messages
- tool results
Broken pattern vs fixed pattern
| Broken | Fixed |
|---|---|
| Reuses prior AIMessage with tool_calls | Passes only clean chat history or uses proper memory |
| Manually appends tool messages | Lets LangChain manage the tool loop |
| Re-invokes agent with already-consumed messages | Starts each run from fresh input state |
```python
# BROKEN
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain.agents import create_tool_calling_agent, AgentExecutor

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather_tool]
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

messages = [
    HumanMessage(content="What's the weather in Nairobi?"),
]
result = executor.invoke({"input": messages})

# Bad: storing the raw assistant response, including tool_calls
messages.append(result["output"])

# Bad: invoking again with already-consumed messages
result2 = executor.invoke({"input": messages})
```
```python
# FIXED
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from langchain.agents import create_tool_calling_agent, AgentExecutor

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather_tool]
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

# Keep only user-facing conversation state here
chat_history = [
    HumanMessage(content="What's the weather in Nairobi?"),
]
result = executor.invoke({"input": "What's the weather in Nairobi?"})

# If you need persistence, store the final text output,
# not raw internal tool-call messages.
chat_history.append(AIMessage(content=result["output"]))
```
The key rule: don't replay internal tool-call messages unless you know exactly how your agent expects them. In most apps, you should persist user/assistant text separately from LangChain's internal tool-execution trace.
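To make the rule concrete, here is a minimal history-sanitizing sketch. It uses plain dicts in place of LangChain message objects so it runs with no dependencies; with real messages you would check `isinstance(msg, AIMessage)` and `msg.tool_calls` instead of dict keys. `clean_history` is a hypothetical helper, not a LangChain API.

```python
# Hypothetical helper (not a LangChain API): messages are plain dicts here
# to keep the sketch dependency-free.
def clean_history(messages):
    """Keep only user turns and final assistant text; drop the tool trace."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "tool":
            continue  # internal execution trace, not conversation
        if msg["role"] == "assistant" and msg.get("tool_calls"):
            continue  # replaying these is what triggers duplicate tool calls
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "Weather in Nairobi?"},
    {"role": "assistant", "content": "", "tool_calls": [{"name": "get_weather"}]},
    {"role": "tool", "content": "22C, sunny"},
    {"role": "assistant", "content": "It's 22C and sunny in Nairobi."},
]
print([m["role"] for m in clean_history(history)])  # ['user', 'assistant']
```

Persist the cleaned list; reconstruct agent input from it on the next turn instead of replaying the raw trace.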
Other Possible Causes
1) You’re using both manual tool execution and an agent executor
If your code calls the tool directly and also lets AgentExecutor run it, the tool executes twice.
```python
# BROKEN: the tool runs once here...
tool_result = get_weather_tool.invoke({"location": "Nairobi"})
# ...and again inside the agent loop
response = executor.invoke({"input": "What's the weather in Nairobi?"})
```

```python
# FIXED: let the agent own tool execution
response = executor.invoke({"input": "What's the weather in Nairobi?"})
```
Pick one control plane:
- either let LangChain handle tools through the agent loop
- or call tools yourself and skip agent tool calling
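If you take the manual route, the shape looks like this sketch. `get_weather` and `FakeLLM` are stand-ins (the real pieces would be your tool function and a `ChatOpenAI` instance without `.bind_tools()`), so there is no agent loop that could run the tool a second time.

```python
# Manual control plane: your code is the only thing that executes the tool.
def get_weather(location: str) -> str:
    # Stand-in for the real tool body
    return f"Sunny, 24C in {location}"

class FakeLLM:
    # Stand-in for a chat model WITHOUT tools bound, so it can only summarize
    def invoke(self, prompt: str) -> str:
        return prompt.splitlines()[-1]  # a real model would paraphrase

def run_manually(llm, question: str) -> str:
    tool_result = get_weather("Nairobi")  # the ONE place the tool runs
    # Feed the result to the model as plain context, not as a tool message
    return llm.invoke(f"{question}\nTool result: {tool_result}")

print(run_manually(FakeLLM(), "What's the weather in Nairobi?"))
# Tool result: Sunny, 24C in Nairobi
```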
2) Streaming callback code is re-processing partial chunks
With streaming models like ChatOpenAI(streaming=True), your callback handler may see partial deltas multiple times. If you treat each chunk as a full tool call, you can accidentally submit duplicates.
```python
llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

# Bad: treating every streamed chunk as a complete tool call
for chunk in llm.stream("Check status"):
    handle_tool_call(chunk)
```
Fix by only acting on finalized tool-call events or by buffering until completion.
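The buffering fix can be sketched without a live model. The dict-based deltas below are a simplified stand-in for LangChain's streamed message chunks, and `handle_tool_call` is illustrative; the point is that the handler fires once per call id, only after the stream is fully consumed.

```python
# Simulated stream: one tool call arrives split across three deltas.
chunks = [
    {"id": "call_1", "name": "get_weather", "args": '{"loc'},
    {"id": "call_1", "name": None,          "args": 'ation": "Nai'},
    {"id": "call_1", "name": None,          "args": 'robi"}'},
]

def accumulate(chunks):
    """Buffer deltas per call id; only complete calls leave this function."""
    calls = {}
    for c in chunks:
        call = calls.setdefault(c["id"], {"name": None, "args": ""})
        if c["name"]:
            call["name"] = c["name"]
        call["args"] += c["args"]
    return calls

executed = []
def handle_tool_call(name, args):
    executed.append((name, args))  # a real handler would invoke the tool

# Act once per finalized call, after the stream is fully consumed
for call in accumulate(chunks).values():
    handle_tool_call(call["name"], call["args"])

print(executed)  # [('get_weather', '{"location": "Nairobi"}')]
```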
3) Retry logic is re-sending non-idempotent requests
If you wrapped agent execution in retries and a timeout occurs after a tool has already executed, the retry can repeat the same call.
```python
from tenacity import retry

@retry()
def run_agent():
    return executor.invoke({"input": "Create ticket for customer 123"})
```
If that request creates side effects, make the downstream operation idempotent:
- use request IDs
- dedupe on the backend side
- avoid blind retries around non-idempotent tools
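Here is a sketch of the request-ID approach, with an in-memory set standing in for backend dedupe storage. `create_ticket` and the ID scheme are illustrative, not a LangChain or tenacity API.

```python
import uuid

_seen_requests = set()  # a real service would dedupe in its datastore
tickets = []

def create_ticket(customer_id: str, request_id: str) -> str:
    """Idempotent write: a replayed request_id becomes a no-op."""
    if request_id in _seen_requests:
        return "duplicate-ignored"
    _seen_requests.add(request_id)
    tickets.append(customer_id)
    return "created"

# One logical request gets ONE id, even if the agent/retry layer resends it
req_id = str(uuid.uuid4())
print(create_ticket("123", req_id))  # created
print(create_ticket("123", req_id))  # duplicate-ignored (the retry)
print(len(tickets))                  # 1
```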
4) Your prompt encourages repeated tool usage
Some prompts cause the model to keep “thinking” it needs to call the same function again. This is common when instructions are vague or when you don’t clearly tell the model when to stop.
```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "Use tools whenever useful."),
    ("human", "{input}"),
])
```

Tighten it:

```python
prompt = ChatPromptTemplate.from_messages([
    ("system", "Call each required tool at most once per user request. "
               "Do not repeat a tool call if you already have its result."),
    ("human", "{input}"),
])
```
How to Debug It
- Inspect the raw message list before invocation
  - Log every `HumanMessage`, `AIMessage`, and `ToolMessage`
  - Look for repeated `tool_calls` payloads or duplicated assistant turns
- Turn off retries temporarily
  - Remove `tenacity`, HTTP retries, and wrapper-level retries
  - If the error disappears, your retry path is replaying state
- Disable streaming
  - Run in non-streaming mode first: `llm = ChatOpenAI(model="gpt-4o-mini", streaming=False)`
  - If duplicates vanish, your callback/event handling is likely double-counting chunks
- Check whether your app stores internal LangChain messages
  - Don't persist raw `AIMessage(tool_calls=[...])` unless your resume logic explicitly supports it
  - Persist clean user/assistant text separately from execution metadata
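The first step can be automated with a small checker that flags repeated payloads before you invoke the agent. `find_duplicate_tool_calls` is a hypothetical helper, and the messages are plain dicts standing in for LangChain messages:

```python
import json
from collections import Counter

def find_duplicate_tool_calls(messages):
    """Return (name, args) pairs seen in more than one assistant turn."""
    counts = Counter()
    for msg in messages:
        for call in msg.get("tool_calls", []):
            # Serialize args so dicts with identical content compare equal
            counts[(call["name"], json.dumps(call["args"], sort_keys=True))] += 1
    return [key for key, n in counts.items() if n > 1]

log = [
    {"role": "assistant", "tool_calls": [{"name": "get_weather", "args": {"location": "Nairobi"}}]},
    {"role": "tool", "content": "22C"},
    {"role": "assistant", "tool_calls": [{"name": "get_weather", "args": {"location": "Nairobi"}}]},
]
print(find_duplicate_tool_calls(log))
# [('get_weather', '{"location": "Nairobi"}')]
```

An empty result means the duplication is happening downstream (retries, streaming callbacks) rather than in your stored history.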
Prevention
- Use one owner for tool execution: either LangChain agents or manual orchestration, not both at once.
- Keep persisted chat history clean: store final assistant text, and avoid replaying internal `tool_calls` and intermediate `ToolMessage`s unless required.
- Make tools idempotent where possible: add request IDs, dedupe on backend writes, and design retries so they don't create duplicate side effects.
If you’re seeing this with AgentExecutor, create_tool_calling_agent, or a custom callback handler, start by logging message flow. In most cases, the bug is not in LangChain itself — it’s in how your app replays or retries messages around it.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.