How to Fix 'tool calling failure in production' in LangChain (Python)
If you’re seeing a "tool calling failure in production" in LangChain, it usually means the model returned something LangChain could not convert into a valid tool call. In practice, this shows up when the prompt, model, or tool schema are out of sync, or when you’re using a model that doesn’t reliably support structured tool calls.
This is common in agent flows, especially after a deployment where a prompt changed, a provider model got swapped, or the tool schema evolved without matching code updates.
The Most Common Cause
The #1 cause is using a model that is not actually bound to tools correctly, or binding tools but then calling the wrong chain path.
A very common failure looks like this:
| Broken pattern | Fixed pattern |
|---|---|
| Model is asked to call tools, but tools are never bound | Model is explicitly bound with bind_tools() |
| Agent expects structured tool calls, but plain invoke() is used incorrectly | Use AgentExecutor or the proper agent runner |
| Tool schema exists, but model output is free-form text | Force tool-capable chat model and proper tool binding |
Broken code
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

@tool
def get_policy_status(policy_id: str) -> str:
    """Get policy status by policy ID."""
    return f"Policy {policy_id} is active"

llm = ChatOpenAI(model="gpt-3.5-turbo")  # not ideal for tool calling reliability

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    # create_tool_calling_agent requires an agent_scratchpad placeholder
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_tool_calling_agent(llm, [get_policy_status], prompt)
executor = AgentExecutor(agent=agent, tools=[get_policy_status])

# This often fails with tool calling issues in production
result = executor.invoke({"input": "Check policy 123"})
print(result)
```
Typical runtime symptoms include messages like:
- `LangChainError: Tool calling failed`
- `InvalidToolCall`
- `OutputParserException: Could not parse LLM output`
- `ToolMessage expected but not found`
Fixed code
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

@tool
def get_policy_status(policy_id: str) -> str:
    """Get policy status by policy ID."""
    return f"Policy {policy_id} is active"

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools([get_policy_status])

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when needed."),
    ("human", "{input}"),
    # create_tool_calling_agent requires an agent_scratchpad placeholder
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_tool_calling_agent(llm, [get_policy_status], prompt)
executor = AgentExecutor(agent=agent, tools=[get_policy_status], verbose=True)

result = executor.invoke({"input": "Check policy 123"})
print(result)
```
The important change is the explicit .bind_tools([get_policy_status]). Without that, LangChain may build an agent that expects tool-call-capable responses while the model returns plain text.
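To confirm the binding took effect, you can invoke the bound model directly and check whether the response actually carries a `tool_calls` payload. A minimal sanity check, reusing `llm` from the fixed example above:

```python
# Call the bound model directly, bypassing the agent entirely.
resp = llm.invoke("Check policy 123")

# On a tool-call response, `tool_calls` is a non-empty list of dicts
# like {"name": "get_policy_status", "args": {...}, "id": "..."}.
print(resp.tool_calls)
print(resp.content)  # often empty when the model chose a tool call over text
```

If `tool_calls` is empty here, the problem is the model or the binding, not your agent wiring.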
Other Possible Causes
1. Tool schema mismatch
If your function signature changes but your prompts or downstream parsing still expect the old shape, LangChain can fail to validate the tool call.
```python
# Broken: the tool expects claim_id as an int in one place
# while the prompt implies a string elsewhere
@tool
def get_claim(claim_id: int) -> str:
    """Get a claim by its ID."""
    return f"Claim {claim_id}"

# Prompt asks for a "claim number" and the model sends "ABC-123"
```
Fix it by keeping the type strict and consistent across all layers.
```python
@tool
def get_claim(claim_id: str) -> str:
    """Get a claim by its ID."""
    return f"Claim {claim_id}"
```
2. Using a non-tool-capable or poorly configured model
Some models will answer in natural language even when asked to call tools. That leads to parser failures rather than actual tool invocation.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
Prefer a model that reliably supports structured tool calls.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools([my_tool])
3. Missing or malformed prompt instructions
If your system prompt does not clearly instruct the agent to use tools only when appropriate, the model may hallucinate an answer instead of producing a valid tool call.
```python
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer questions."),
    ("human", "{input}"),
])
```
Better:
```python
prompt = ChatPromptTemplate.from_messages([
    ("system", "Use available tools for factual lookups. Do not invent values."),
    ("human", "{input}"),
])
```
4. Returning unsupported types from tools
LangChain tools should return strings or JSON-serializable content. Returning custom objects often breaks downstream message handling.
```python
@tool
def lookup_customer(customer_id: str):
    """Look up a customer record."""
    return Customer(id=customer_id)  # broken if Customer isn't serialized properly
```
Fix:
```python
import json

@tool
def lookup_customer(customer_id: str) -> str:
    """Look up a customer record and return it as JSON."""
    data = {"id": customer_id, "status": "active"}
    return json.dumps(data)
```
How to Debug It
- Turn on verbose logging
  - Use `verbose=True` on `AgentExecutor`.
  - Inspect whether the model produced an actual `tool_calls` payload or just plain text.
- Print raw LLM output
  - If you’re using callbacks or traces, check whether the response contains:
    - a valid tool name
    - valid arguments JSON
    - matching schema fields
- Validate your tool signature
  - Confirm parameter names and types match what the prompt implies.
  - Check for renamed arguments like `policyId` vs `policy_id`.
- Reduce to one tool and one input (see the sketch below)
  - Remove all extra tools.
  - Test with a single deterministic input.
  - If it works locally but fails in prod, compare model version and environment variables first.
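A minimal repro harness might look like this sketch, reusing `agent` and `get_policy_status` from the fixed example above; `return_intermediate_steps` is a standard `AgentExecutor` option that exposes each (action, observation) pair:

```python
from langchain.agents import AgentExecutor

# One tool, one deterministic input, full tracing.
executor = AgentExecutor(
    agent=agent,                     # agent from the fixed example above
    tools=[get_policy_status],
    verbose=True,
    return_intermediate_steps=True,  # expose each (action, observation) pair
)

out = executor.invoke({"input": "Check policy 123"})
for action, observation in out["intermediate_steps"]:
    print(action.tool, action.tool_input, observation)
```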
A useful sanity check is to determine whether your error is coming from parsing or execution (see the sketch after this list):
- Parsing errors usually show up as `OutputParserException` or `InvalidToolCall`.
- Execution errors usually show up after the tool was called successfully.
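One way to separate the two in code, assuming the `executor` from the fixed example. Depending on `AgentExecutor` settings, parse failures may arrive wrapped in another exception, so this sketch walks the cause chain:

```python
from langchain_core.exceptions import OutputParserException

try:
    result = executor.invoke({"input": "Check policy 123"})
except Exception as exc:
    # Parse failures may be wrapped, so inspect the cause chain.
    cause = exc
    while cause is not None:
        if isinstance(cause, OutputParserException):
            print("parse failure:", cause)
            break
        cause = cause.__cause__
    else:
        print("execution failure:", exc)
```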
Prevention
- Bind tools explicitly with `.bind_tools()` and use models known to support structured tool calling.
- Keep tool schemas stable; treat function signatures as part of your public contract.
- Add integration tests that assert both (see the sketch below):
  - the agent returns a valid `tool_calls` structure
  - the final response works after executing the tool
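A minimal pytest-style sketch of both assertions, reusing `llm` and `executor` from the fixed example; these tests hit the live model, so gate them accordingly in CI:

```python
# Hypothetical integration tests; `llm` and `executor` come from
# the fixed example above.

def test_model_emits_tool_call():
    resp = llm.invoke("Check policy 123")
    assert resp.tool_calls, "model returned plain text instead of a tool call"
    assert resp.tool_calls[0]["name"] == "get_policy_status"

def test_agent_completes_after_tool():
    out = executor.invoke({"input": "Check policy 123"})
    assert "active" in out["output"].lower()
```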
If you’re shipping LangChain agents into production systems like banking or insurance workflows, don’t rely on happy-path demos. Test against real prompts, real schemas, and at least one failure case where the model skips the tool entirely.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit