How to Fix 'tool calling failure in production' in LangGraph (Python)
When LangGraph throws a tool calling failure in production, it usually means the model produced an assistant message that looks like a tool call, but your graph couldn’t execute it cleanly. In practice, this shows up when the agent node, tool node, or message format is out of sync.
This is common in production when you add retries, streaming, structured outputs, or multiple tools. The bug is usually not “LangGraph is broken” — it’s almost always state shape, message handling, or model/tool configuration.
The Most Common Cause
The #1 cause is using the wrong message pattern between the LLM node and the tool execution node.
In LangGraph, the assistant must emit `tool_calls`, and the next node must be a `ToolNode` that reads those calls from `AIMessage.tool_calls`. If you manually append messages incorrectly, or return plain text instead of an AI message carrying tool calls, you'll get errors like:
- `ValueError: No tool calls found in AIMessage`
- `TypeError: 'NoneType' object is not iterable`
- `ToolInvocationError: Tool execution failed`
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Manually fabricating assistant output as plain text | Let the model produce real tool calls |
| Sending tool results back as `HumanMessage` | Return `ToolMessage` from the tool node |
| Skipping `ToolNode` | Use `ToolNode(tools)` in the graph |
```python
# BROKEN
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, END

def agent_node(state):
    # Model says "call search", but this is just text.
    return {
        "messages": state["messages"] + [
            HumanMessage(content="Use search for account 123")
        ]
    }
```
```python
# FIXED
from langgraph.prebuilt import ToolNode
from langgraph.graph import StateGraph, END

tools = [search_tool]
tool_node = ToolNode(tools)

def agent_node(state):
    # bind_tools lets the model emit structured tool_calls on the AIMessage
    ai_msg = llm.bind_tools(tools).invoke(state["messages"])
    return {"messages": state["messages"] + [ai_msg]}
```
If you are using create_react_agent, do not manually rebuild the message flow. That helper already wires the tool loop correctly.
```python
from langchain_core.messages import HumanMessage
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(llm, tools)
result = agent.invoke({"messages": [HumanMessage(content="Look up policy 8831")]})
```
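Under the hood, that helper runs a loop of this shape. The following is a conceptual, pure-Python sketch only: plain dicts stand in for LangChain message objects, and `model_step` / `run_tools` are hypothetical stand-ins, not LangGraph APIs.

```python
def react_loop(messages, model_step, run_tools, max_turns=5):
    """Conceptual ReAct loop: call the model, run any requested tools,
    feed results back, and stop when the model answers directly."""
    for _ in range(max_turns):
        ai = model_step(messages)
        messages.append(ai)
        if not ai.get("tool_calls"):
            return messages  # no tool calls left: this is the final answer
        messages.extend(run_tools(ai["tool_calls"]))
    return messages

# Stub model: requests one tool call on the first turn, then answers.
turns = {"n": 0}
def model_step(msgs):
    turns["n"] += 1
    if turns["n"] == 1:
        return {"role": "ai", "content": "",
                "tool_calls": [{"name": "search", "args": {"query": "policy 8831"}}]}
    return {"role": "ai", "content": "Policy 8831 found.", "tool_calls": []}

def run_tools(calls):
    return [{"role": "tool", "content": f"ran {c['name']}"} for c in calls]

out = react_loop([{"role": "user", "content": "Look up policy 8831"}],
                 model_step, run_tools)
```

Every failure mode in this article is some part of this loop going wrong: the model never emits `tool_calls`, the tool step never runs, or the results never make it back into `messages`.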
Other Possible Causes
1) Your tool schema does not match the function signature
If your Pydantic args schema says one thing and your Python function expects another, execution fails after planning.
```python
from pydantic import BaseModel
from langchain_core.tools import tool

class SearchArgs(BaseModel):
    query: str

@tool(args_schema=SearchArgs)
def search_tool(term: str):  # mismatch: schema says `query`, function takes `term`
    return f"Searching {term}"
```
Fix it by matching names exactly.
```python
@tool(args_schema=SearchArgs)
def search_tool(query: str):
    return f"Searching {query}"
```
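You can catch this class of bug in CI before it reaches a model at all. This is a generic, standard-library-only sketch; `search_tool` here is a plain function with the mismatch from above, not a decorated tool.

```python
import inspect

def check_args_match(func, schema_fields):
    """Return the parameter names that disagree between a tool function
    and its declared args-schema field names."""
    params = set(inspect.signature(func).parameters)
    fields = set(schema_fields)
    return {
        "missing_in_func": sorted(fields - params),
        "extra_in_func": sorted(params - fields),
    }

def search_tool(term: str):  # mismatched on purpose: the schema declares `query`
    return f"Searching {term}"

report = check_args_match(search_tool, ["query"])
# report flags `query` as missing from the function and `term` as extra
```

Run a check like this over every registered tool at import time and the mismatch fails fast, instead of after the model has already planned a call.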
2) The model does not support tool calling
Some chat models can generate text but don’t emit valid tool calls. In that case LangGraph never gets a usable AIMessage.tool_calls.
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")  # may work poorly depending on setup
```
Use a model with reliable tool calling support and bind tools explicitly.
```python
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)
```
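When debugging, it helps to know roughly what a parsed call looks like. In recent langchain-core versions each entry in `AIMessage.tool_calls` is a plain dict; the exact keys below are illustrative, and the `id` value is hypothetical.

```python
# Illustrative shape of one entry in AIMessage.tool_calls.
example_call = {
    "name": "search_tool",             # must match a bound tool's name
    "args": {"query": "account 123"},  # arguments parsed from the model's JSON
    "id": "call_abc123",               # hypothetical provider-assigned id
}

def has_usable_call(tool_calls):
    """A call is executable only if it names a tool and carries a dict of args."""
    return any(
        c.get("name") and isinstance(c.get("args"), dict)
        for c in (tool_calls or [])
    )
```

If a model "supports" tool calling but `has_usable_call` keeps coming back false, the model is emitting tool-call-shaped text rather than structured calls, and you need a stronger model or explicit `bind_tools`.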
3) You are dropping message history in state transitions
A common production bug is returning only the latest message and losing prior context. Then the next node sees an incomplete state and cannot resolve the call chain.
```python
# BROKEN: overwrites history
def route(state):
    return {"messages": [state["messages"][-1]]}
```
Keep full history unless you intentionally trim it.
```python
# FIXED: preserve conversation state
def route(state):
    return {"messages": state["messages"]}
```
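The effect is easy to see with a pure-Python simulation of the default update behavior: when no reducer (such as `add_messages`) is declared on the `messages` key, a node's returned value simply replaces the old one.

```python
def apply_update(state, update):
    """Default channel behavior with no reducer: the node's returned
    value replaces the previous value for that key."""
    return {**state, **update}

history = {"messages": ["human: look up claim 4482", "ai: tool_call(search)"]}

# Returning only the last message silently drops the rest of the conversation.
broken = apply_update(history, {"messages": [history["messages"][-1]]})

# Returning the full list (or appending to it) preserves context.
fixed = apply_update(history, {"messages": history["messages"] + ["tool: found it"]})
```

If you declare `add_messages` as the reducer on `messages`, returning only the new message is correct, because the reducer appends instead of overwriting; the bug above only bites when the two conventions get mixed.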
4) Your conditional edge routing skips the tool node
If your router sends control to END or another node before tools execute, you’ll see unresolved calls or empty tool responses.
```python
# BROKEN routing example
graph.add_conditional_edges("agent", should_continue, {
    "end": END,
    "tool": "tools",
})
```
Make sure your router returns "tool" whenever `ai_msg.tool_calls` is non-empty.

```python
def should_continue(state):
    last = state["messages"][-1]
    return "tool" if getattr(last, "tool_calls", None) else "end"
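You can sanity-check a router like this without a live model by feeding it stub messages; the stub class below is a hypothetical stand-in for `AIMessage`, used only for the example.

```python
class StubMessage:
    """Minimal stand-in for an AIMessage in a routing test."""
    def __init__(self, tool_calls=None):
        self.tool_calls = tool_calls

def should_continue(state):
    last = state["messages"][-1]
    return "tool" if getattr(last, "tool_calls", None) else "end"

with_call = {"messages": [StubMessage(tool_calls=[{"name": "search_tool"}])]}
plain = {"messages": [StubMessage()]}
# with_call routes to "tool"; plain routes to "end"
```

A two-line test like this in your suite catches inverted routing long before it shows up as an empty tool response in production.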
How to Debug It
- Inspect the last AI message
  - Print `type(last_message)` and `last_message.tool_calls`.
  - If `tool_calls` is empty or missing, your LLM never emitted a valid call.
- Check whether your graph actually reaches `ToolNode`
  - Add logging in each node.
  - If the agent emits a call but tools never run, your routing logic is wrong.
- Validate tool signatures and schemas
  - Confirm parameter names match exactly.
  - Check for optional fields that became required after deployment.
  - Look for serialization issues with nested objects.
- Reproduce with one tool and one turn
  - Remove retries, memory trimming, streaming, and parallel tools.
  - Run a single prompt like `graph.invoke({"messages": [HumanMessage(content="Call search for claim 4482")]})`.
  - If that works locally but fails in prod, compare model version and environment variables first.
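The first two checks collapse into one small helper you can call from every node while debugging. This is a generic sketch; `AIMessageStub` below is a stand-in class for the example only, not a LangChain type.

```python
def describe_last_message(msg):
    """One-line diagnostic: the message's class name plus its tool calls."""
    calls = getattr(msg, "tool_calls", None)
    return f"{type(msg).__name__} tool_calls={calls!r}"

class AIMessageStub:  # hypothetical stand-in for langchain's AIMessage
    def __init__(self, tool_calls):
        self.tool_calls = tool_calls

line = describe_last_message(AIMessageStub([{"name": "search_tool"}]))
```

Logging this one line at the entry of the agent node, the router, and the tool node tells you immediately which hop lost the tool call.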
Prevention
- Use `create_react_agent` unless you have a strong reason to build the loop manually.
- Add assertions before routing:
  - verify the last message is an `AIMessage`
  - verify `tool_calls` exists before sending to tools
- Keep tool schemas simple:
  - flat JSON fields first
  - no optional nesting unless necessary
- Log these fields in production:
  - last message class name
  - `tool_calls`
  - selected route
  - raw tool input payload
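Those assertions can live in one guard function called right before the router hands off to the tool node. This is a sketch, duck-typed on the class name and `tool_calls` attribute; the `AIMessage` class below is a same-named stub so the example is self-contained.

```python
def assert_routable_to_tools(last_message):
    """Fail fast with a clear error instead of a confusing downstream one."""
    name = type(last_message).__name__
    assert name == "AIMessage", f"expected AIMessage before tools, got {name}"
    calls = getattr(last_message, "tool_calls", None)
    assert calls, "AIMessage has no tool_calls; routing to tools would fail"
    return calls

class AIMessage:  # stub with the same class name, for the example only
    def __init__(self, tool_calls=None):
        self.tool_calls = tool_calls

calls = assert_routable_to_tools(AIMessage(tool_calls=[{"name": "search_tool"}]))
```

An `AssertionError` raised here names the real problem ("no tool_calls") at the exact hop where it occurred, which is far cheaper to diagnose than a `ToolInvocationError` three nodes later.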
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.