How to Fix 'duplicate tool calls' in LangChain (Python)
What this error means
duplicate tool calls in LangChain usually means the model produced the same tool invocation more than once, and your agent/runtime tried to execute or register it twice. You’ll typically see it when using AgentExecutor, create_tool_calling_agent, or a chat model with tool calling enabled and some retry/streaming logic layered on top.
In practice, this shows up when you reuse message history incorrectly, append AI messages manually, or let both LangChain and your own code drive tool execution.
The Most Common Cause
The #1 cause is replaying the same assistant/tool messages back into the next agent run.
This happens a lot when people store conversation state in a list, then feed the entire list back into AgentExecutor.invoke() without separating:
- user input
- assistant tool-call messages
- tool results
Broken pattern vs fixed pattern
| Broken | Fixed |
|---|---|
| Reuses prior AIMessage with tool_calls | Passes only clean chat history or uses proper memory |
| Manually appends tool messages | Lets LangChain manage the tool loop |
| Re-invokes agent with already-consumed messages | Starts each run from fresh input state |
```python
# BROKEN
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain.agents import create_tool_calling_agent, AgentExecutor

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather_tool]
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

messages = [
    HumanMessage(content="What's the weather in Nairobi?"),
]
result = executor.invoke({"input": messages})

# Bad: storing the raw assistant response, including tool_calls
messages.append(result["output"])

# Bad: invoking again with already-consumed messages
result2 = executor.invoke({"input": messages})
```
```python
# FIXED
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from langchain.agents import create_tool_calling_agent, AgentExecutor

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather_tool]
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

# Keep only user-facing conversation state here
chat_history = [
    HumanMessage(content="What's the weather in Nairobi?"),
]
result = executor.invoke({"input": "What's the weather in Nairobi?"})

# If you need persistence, store the final text output,
# not raw internal tool-call messages.
chat_history.append(AIMessage(content=result["output"]))
```
The key rule: don't replay internal tool-call messages unless you know exactly how your agent expects them. In most apps, you should persist user/assistant text separately from LangChain's internal tool-execution trace.
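To make the rule concrete, here is a minimal history-sanitizing sketch. It uses plain dicts in place of LangChain message objects so it runs with no dependencies; with real messages you would check `isinstance(msg, AIMessage)` and `msg.tool_calls` instead of dict keys. `clean_history` is a hypothetical helper, not a LangChain API.

```python
# Hypothetical helper (not a LangChain API): messages are plain dicts here
# to keep the sketch dependency-free.
def clean_history(messages):
    """Keep only user turns and final assistant text; drop the tool trace."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "tool":
            continue  # internal execution trace, not conversation
        if msg["role"] == "assistant" and msg.get("tool_calls"):
            continue  # replaying these is what triggers duplicate tool calls
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "Weather in Nairobi?"},
    {"role": "assistant", "content": "", "tool_calls": [{"name": "get_weather"}]},
    {"role": "tool", "content": "22C, sunny"},
    {"role": "assistant", "content": "It's 22C and sunny in Nairobi."},
]
print([m["role"] for m in clean_history(history)])  # ['user', 'assistant']
```

Persist the cleaned list; reconstruct agent input from it on the next turn instead of replaying the raw trace.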
Other Possible Causes
1) You’re using both manual tool execution and an agent executor
If your code calls the tool directly and also lets AgentExecutor run it, the tool executes twice.
```python
# BROKEN: the tool runs once here...
tool_result = get_weather_tool.invoke({"location": "Nairobi"})
# ...and again inside the agent loop
response = executor.invoke({"input": "What's the weather in Nairobi?"})
```

```python
# FIXED: let the agent own tool execution
response = executor.invoke({"input": "What's the weather in Nairobi?"})
```
Pick one control plane:
- either let LangChain handle tools through the agent loop
- or call tools yourself and skip agent tool calling
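If you take the manual route, the shape looks like this sketch. `get_weather` and `FakeLLM` are stand-ins (the real pieces would be your tool function and a `ChatOpenAI` instance without `.bind_tools()`), so there is no agent loop that could run the tool a second time.

```python
# Manual control plane: your code is the only thing that executes the tool.
def get_weather(location: str) -> str:
    # Stand-in for the real tool body
    return f"Sunny, 24C in {location}"

class FakeLLM:
    # Stand-in for a chat model WITHOUT tools bound, so it can only summarize
    def invoke(self, prompt: str) -> str:
        return prompt.splitlines()[-1]  # a real model would paraphrase

def run_manually(llm, question: str) -> str:
    tool_result = get_weather("Nairobi")  # the ONE place the tool runs
    # Feed the result to the model as plain context, not as a tool message
    return llm.invoke(f"{question}\nTool result: {tool_result}")

print(run_manually(FakeLLM(), "What's the weather in Nairobi?"))
# Tool result: Sunny, 24C in Nairobi
```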
2) Streaming callback code is re-processing partial chunks
With streaming models like ChatOpenAI(streaming=True), your callback handler may see partial deltas multiple times. If you treat each chunk as a full tool call, you can accidentally submit duplicates.
```python
llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

# Bad: treating every streamed chunk as a complete tool call
for chunk in llm.stream("Check status"):
    handle_tool_call(chunk)
```
Fix by only acting on finalized tool-call events or by buffering until completion.
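The buffering fix can be sketched without a live model. The dict-based deltas below are a simplified stand-in for LangChain's streamed message chunks, and `handle_tool_call` is illustrative; the point is that the handler fires once per call id, only after the stream is fully consumed.

```python
# Simulated stream: one tool call arrives split across three deltas.
chunks = [
    {"id": "call_1", "name": "get_weather", "args": '{"loc'},
    {"id": "call_1", "name": None,          "args": 'ation": "Nai'},
    {"id": "call_1", "name": None,          "args": 'robi"}'},
]

def accumulate(chunks):
    """Buffer deltas per call id; only complete calls leave this function."""
    calls = {}
    for c in chunks:
        call = calls.setdefault(c["id"], {"name": None, "args": ""})
        if c["name"]:
            call["name"] = c["name"]
        call["args"] += c["args"]
    return calls

executed = []
def handle_tool_call(name, args):
    executed.append((name, args))  # a real handler would invoke the tool

# Act once per finalized call, after the stream is fully consumed
for call in accumulate(chunks).values():
    handle_tool_call(call["name"], call["args"])

print(executed)  # [('get_weather', '{"location": "Nairobi"}')]
```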
3) Retry logic is re-sending non-idempotent requests
If you wrapped agent execution in retries and a timeout occurs after a tool has already executed, the retry can repeat the same call.
```python
from tenacity import retry

@retry()
def run_agent():
    return executor.invoke({"input": "Create ticket for customer 123"})
```
If that request creates side effects, make the downstream operation idempotent:
- use request IDs
- dedupe on the backend side
- avoid blind retries around non-idempotent tools
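Here is a sketch of the request-ID approach, with an in-memory set standing in for backend dedupe storage. `create_ticket` and the ID scheme are illustrative, not a LangChain or tenacity API.

```python
import uuid

_seen_requests = set()  # a real service would dedupe in its datastore
tickets = []

def create_ticket(customer_id: str, request_id: str) -> str:
    """Idempotent write: a replayed request_id becomes a no-op."""
    if request_id in _seen_requests:
        return "duplicate-ignored"
    _seen_requests.add(request_id)
    tickets.append(customer_id)
    return "created"

# One logical request gets ONE id, even if the agent/retry layer resends it
req_id = str(uuid.uuid4())
print(create_ticket("123", req_id))  # created
print(create_ticket("123", req_id))  # duplicate-ignored (the retry)
print(len(tickets))                  # 1
```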
4) Your prompt encourages repeated tool usage
Some prompts cause the model to keep “thinking” it needs to call the same function again. This is common when instructions are vague or when you don’t clearly tell the model when to stop.
```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "Use tools whenever useful."),
    ("human", "{input}"),
])
```

Tighten it:

```python
prompt = ChatPromptTemplate.from_messages([
    ("system", "Call each required tool at most once per user request. "
               "Do not repeat a tool call if you already have its result."),
    ("human", "{input}"),
])
```
How to Debug It
- Inspect the raw message list before invocation
  - Log every `HumanMessage`, `AIMessage`, and `ToolMessage`
  - Look for repeated `tool_calls` payloads or duplicated assistant turns
- Turn off retries temporarily
  - Remove `tenacity`, HTTP retries, and wrapper-level retries
  - If the error disappears, your retry path is replaying state
- Disable streaming
  - Run in non-streaming mode first: `llm = ChatOpenAI(model="gpt-4o-mini", streaming=False)`
  - If duplicates vanish, your callback/event handling is likely double-counting chunks
- Check whether your app stores internal LangChain messages
  - Don't persist raw `AIMessage(tool_calls=[...])` unless your resume logic explicitly supports it
  - Persist clean user/assistant text separately from execution metadata
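The first step can be automated with a small checker that flags repeated payloads before you invoke the agent. `find_duplicate_tool_calls` is a hypothetical helper, and the messages are plain dicts standing in for LangChain messages:

```python
import json
from collections import Counter

def find_duplicate_tool_calls(messages):
    """Return (name, args) pairs seen in more than one assistant turn."""
    counts = Counter()
    for msg in messages:
        for call in msg.get("tool_calls", []):
            # Serialize args so dicts with identical content compare equal
            counts[(call["name"], json.dumps(call["args"], sort_keys=True))] += 1
    return [key for key, n in counts.items() if n > 1]

log = [
    {"role": "assistant", "tool_calls": [{"name": "get_weather", "args": {"location": "Nairobi"}}]},
    {"role": "tool", "content": "22C"},
    {"role": "assistant", "tool_calls": [{"name": "get_weather", "args": {"location": "Nairobi"}}]},
]
print(find_duplicate_tool_calls(log))
# [('get_weather', '{"location": "Nairobi"}')]
```

An empty result means the duplication is happening downstream (retries, streaming callbacks) rather than in your stored history.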
Prevention
- Use one owner for tool execution: either LangChain agents or manual orchestration, not both at once.
- Keep persisted chat history clean: store final assistant text, and avoid replaying internal `tool_calls` and intermediate `ToolMessage`s unless required.
- Make tools idempotent where possible: add request IDs, dedupe on backend writes, and design retries so they don't create duplicate side effects.
If you’re seeing this with AgentExecutor, create_tool_calling_agent, or a custom callback handler, start by logging message flow. In most cases, the bug is not in LangChain itself — it’s in how your app replays or retries messages around it.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.