How to Fix 'tool calling failure in production' in LangGraph (Python)
When LangGraph throws a tool calling failure in production, it usually means the model produced an assistant message that looks like a tool call, but your graph couldn’t execute it cleanly. In practice, this shows up when the agent node, tool node, or message format is out of sync.
This is common in production when you add retries, streaming, structured outputs, or multiple tools. The bug is usually not “LangGraph is broken” — it’s almost always state shape, message handling, or model/tool configuration.
The Most Common Cause
The #1 cause is using the wrong message pattern between the LLM node and the tool execution node.
In LangGraph, the assistant must emit `tool_calls`, and the next node must be a `ToolNode` that reads those calls from `AIMessage.tool_calls`. If you manually append messages incorrectly, or return plain text instead of an AI message carrying tool calls, you'll get errors like:
- `ValueError: No tool calls found in AIMessage`
- `TypeError: 'NoneType' object is not iterable`
- `ToolInvocationError: Tool execution failed`
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Manually fabricating assistant output as plain text | Let the model produce real tool calls |
| Sending tool results back as `HumanMessage` | Return `ToolMessage` from the tool node |
| Skipping `ToolNode` | Use `ToolNode(tools)` in the graph |
```python
# BROKEN
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, END

def agent_node(state):
    # Model says "call search", but this is just text.
    return {
        "messages": state["messages"] + [
            HumanMessage(content="Use search for account 123")
        ]
    }
```
```python
# FIXED
from langgraph.prebuilt import ToolNode
from langgraph.graph import StateGraph, END

tools = [search_tool]
tool_node = ToolNode(tools)

def agent_node(state):
    # bind_tools lets the model emit structured tool_calls on the AIMessage
    ai_msg = llm.bind_tools(tools).invoke(state["messages"])
    return {"messages": state["messages"] + [ai_msg]}
```
If you are using create_react_agent, do not manually rebuild the message flow. That helper already wires the tool loop correctly.
```python
from langchain_core.messages import HumanMessage
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(llm, tools)
result = agent.invoke({"messages": [HumanMessage(content="Look up policy 8831")]})
```
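Under the hood, that helper runs a loop of this shape. The following is a conceptual, pure-Python sketch only: plain dicts stand in for LangChain message objects, and `model_step` / `run_tools` are hypothetical stand-ins, not LangGraph APIs.

```python
def react_loop(messages, model_step, run_tools, max_turns=5):
    """Conceptual ReAct loop: call the model, run any requested tools,
    feed results back, and stop when the model answers directly."""
    for _ in range(max_turns):
        ai = model_step(messages)
        messages.append(ai)
        if not ai.get("tool_calls"):
            return messages  # no tool calls left: this is the final answer
        messages.extend(run_tools(ai["tool_calls"]))
    return messages

# Stub model: requests one tool call on the first turn, then answers.
turns = {"n": 0}
def model_step(msgs):
    turns["n"] += 1
    if turns["n"] == 1:
        return {"role": "ai", "content": "",
                "tool_calls": [{"name": "search", "args": {"query": "policy 8831"}}]}
    return {"role": "ai", "content": "Policy 8831 found.", "tool_calls": []}

def run_tools(calls):
    return [{"role": "tool", "content": f"ran {c['name']}"} for c in calls]

out = react_loop([{"role": "user", "content": "Look up policy 8831"}],
                 model_step, run_tools)
```

Every failure mode in this article is some part of this loop going wrong: the model never emits `tool_calls`, the tool step never runs, or the results never make it back into `messages`.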
Other Possible Causes
1) Your tool schema does not match the function signature
If your Pydantic args schema says one thing and your Python function expects another, execution fails after planning.
```python
from pydantic import BaseModel
from langchain_core.tools import tool

class SearchArgs(BaseModel):
    query: str

@tool(args_schema=SearchArgs)
def search_tool(term: str):  # mismatch: schema says `query`, function takes `term`
    return f"Searching {term}"
```
Fix it by matching names exactly.
```python
@tool(args_schema=SearchArgs)
def search_tool(query: str):
    return f"Searching {query}"
```
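You can catch this class of bug in CI before it reaches a model at all. This is a generic, standard-library-only sketch; `search_tool` here is a plain function with the mismatch from above, not a decorated tool.

```python
import inspect

def check_args_match(func, schema_fields):
    """Return the parameter names that disagree between a tool function
    and its declared args-schema field names."""
    params = set(inspect.signature(func).parameters)
    fields = set(schema_fields)
    return {
        "missing_in_func": sorted(fields - params),
        "extra_in_func": sorted(params - fields),
    }

def search_tool(term: str):  # mismatched on purpose: the schema declares `query`
    return f"Searching {term}"

report = check_args_match(search_tool, ["query"])
# report flags `query` as missing from the function and `term` as extra
```

Run a check like this over every registered tool at import time and the mismatch fails fast, instead of after the model has already planned a call.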
2) The model does not support tool calling
Some chat models can generate text but don’t emit valid tool calls. In that case LangGraph never gets a usable AIMessage.tool_calls.
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")  # may work poorly depending on setup
```
Use a model with reliable tool calling support and bind tools explicitly.
```python
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)
```
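When debugging, it helps to know roughly what a parsed call looks like. In recent langchain-core versions each entry in `AIMessage.tool_calls` is a plain dict; the exact keys below are illustrative, and the `id` value is hypothetical.

```python
# Illustrative shape of one entry in AIMessage.tool_calls.
example_call = {
    "name": "search_tool",             # must match a bound tool's name
    "args": {"query": "account 123"},  # arguments parsed from the model's JSON
    "id": "call_abc123",               # hypothetical provider-assigned id
}

def has_usable_call(tool_calls):
    """A call is executable only if it names a tool and carries a dict of args."""
    return any(
        c.get("name") and isinstance(c.get("args"), dict)
        for c in (tool_calls or [])
    )
```

If a model "supports" tool calling but `has_usable_call` keeps coming back false, the model is emitting tool-call-shaped text rather than structured calls, and you need a stronger model or explicit `bind_tools`.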
3) You are dropping message history in state transitions
A common production bug is returning only the latest message and losing prior context. Then the next node sees an incomplete state and cannot resolve the call chain.
```python
# BROKEN: overwrites history
def route(state):
    return {"messages": [state["messages"][-1]]}
```
Keep full history unless you intentionally trim it.
```python
# FIXED: preserve conversation state
def route(state):
    return {"messages": state["messages"]}
```
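The effect is easy to see with a pure-Python simulation of the default update behavior: when no reducer (such as `add_messages`) is declared on the `messages` key, a node's returned value simply replaces the old one.

```python
def apply_update(state, update):
    """Default channel behavior with no reducer: the node's returned
    value replaces the previous value for that key."""
    return {**state, **update}

history = {"messages": ["human: look up claim 4482", "ai: tool_call(search)"]}

# Returning only the last message silently drops the rest of the conversation.
broken = apply_update(history, {"messages": [history["messages"][-1]]})

# Returning the full list (or appending to it) preserves context.
fixed = apply_update(history, {"messages": history["messages"] + ["tool: found it"]})
```

If you declare `add_messages` as the reducer on `messages`, returning only the new message is correct, because the reducer appends instead of overwriting; the bug above only bites when the two conventions get mixed.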
4) Your conditional edge routing skips the tool node
If your router sends control to END or another node before tools execute, you’ll see unresolved calls or empty tool responses.
```python
# BROKEN routing example
graph.add_conditional_edges("agent", should_continue, {
    "end": END,
    "tool": "tools",
})
```
Make sure your router returns "tool" whenever `ai_msg.tool_calls` is non-empty.

```python
def should_continue(state):
    last = state["messages"][-1]
    return "tool" if getattr(last, "tool_calls", None) else "end"
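You can sanity-check a router like this without a live model by feeding it stub messages; the stub class below is a hypothetical stand-in for `AIMessage`, used only for the example.

```python
class StubMessage:
    """Minimal stand-in for an AIMessage in a routing test."""
    def __init__(self, tool_calls=None):
        self.tool_calls = tool_calls

def should_continue(state):
    last = state["messages"][-1]
    return "tool" if getattr(last, "tool_calls", None) else "end"

with_call = {"messages": [StubMessage(tool_calls=[{"name": "search_tool"}])]}
plain = {"messages": [StubMessage()]}
# with_call routes to "tool"; plain routes to "end"
```

A two-line test like this in your suite catches inverted routing long before it shows up as an empty tool response in production.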
How to Debug It
- Inspect the last AI message
  - Print `type(last_message)` and `last_message.tool_calls`.
  - If `tool_calls` is empty or missing, your LLM never emitted a valid call.
- Check whether your graph actually reaches `ToolNode`
  - Add logging in each node.
  - If the agent emits a call but tools never run, your routing logic is wrong.
- Validate tool signatures and schemas
  - Confirm parameter names match exactly.
  - Check for optional fields that became required after deployment.
  - Look for serialization issues with nested objects.
- Reproduce with one tool and one turn
  - Remove retries, memory trimming, streaming, and parallel tools.
  - Run a single prompt like `graph.invoke({"messages": [HumanMessage(content="Call search for claim 4482")]})`.
  - If that works locally but fails in prod, compare model version and environment variables first.
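The first two checks collapse into one small helper you can call from every node while debugging. This is a generic sketch; `AIMessageStub` below is a stand-in class for the example only, not a LangChain type.

```python
def describe_last_message(msg):
    """One-line diagnostic: the message's class name plus its tool calls."""
    calls = getattr(msg, "tool_calls", None)
    return f"{type(msg).__name__} tool_calls={calls!r}"

class AIMessageStub:  # hypothetical stand-in for langchain's AIMessage
    def __init__(self, tool_calls):
        self.tool_calls = tool_calls

line = describe_last_message(AIMessageStub([{"name": "search_tool"}]))
```

Logging this one line at the entry of the agent node, the router, and the tool node tells you immediately which hop lost the tool call.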
Prevention
- Use `create_react_agent` unless you have a strong reason to build the loop manually.
- Add assertions before routing:
  - verify the last message is an `AIMessage`
  - verify `tool_calls` exists before sending to tools
- Keep tool schemas simple:
  - flat JSON fields first
  - no optional nesting unless necessary
- Log these fields in production:
  - last message class name
  - `tool_calls`
  - selected route
  - raw tool input payload
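Those assertions can live in one guard function called right before the router hands off to the tool node. This is a sketch, duck-typed on the class name and `tool_calls` attribute; the `AIMessage` class below is a same-named stub so the example is self-contained.

```python
def assert_routable_to_tools(last_message):
    """Fail fast with a clear error instead of a confusing downstream one."""
    name = type(last_message).__name__
    assert name == "AIMessage", f"expected AIMessage before tools, got {name}"
    calls = getattr(last_message, "tool_calls", None)
    assert calls, "AIMessage has no tool_calls; routing to tools would fail"
    return calls

class AIMessage:  # stub with the same class name, for the example only
    def __init__(self, tool_calls=None):
        self.tool_calls = tool_calls

calls = assert_routable_to_tools(AIMessage(tool_calls=[{"name": "search_tool"}]))
```

An `AssertionError` raised here names the real problem ("no tool_calls") at the exact hop where it occurred, which is far cheaper to diagnose than a `ToolInvocationError` three nodes later.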
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.