How to Fix 'tool calling failure during development' in LlamaIndex (Python)
What this error means
A "tool calling failure during development" usually means LlamaIndex tried to execute a tool call from the LLM, but the request/response shape did not match what the tool runner expected. In practice, this shows up when you're wiring `FunctionTool`, `ReActAgent`, or an OpenAI-style function-calling model and the model returns malformed tool arguments, the tool signature is wrong, or your provider/model does not support tool calling the way LlamaIndex expects.
The symptom is often a stack trace ending in something like:
- `ValueError: Tool calling failure during development`
- `ValidationError` from Pydantic
- OpenAI API error: `invalid_request_error`
- `tool_calls` missing or malformed in the assistant response
The Most Common Cause
The #1 cause is a mismatch between your Python function signature and what the LLM sends as tool arguments.
LlamaIndex converts your Python callable into a schema. If the model emits arguments that don’t match that schema, you get a tool execution failure. This happens a lot when people use positional-only params, unsupported types, or forget to make parameters explicit.
Broken vs fixed
Broken:

```python
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent

# Untyped, ambiguous signature: the generated schema tells the model very little
def lookup_customer(customer_id, include_history=False):
    return {"id": customer_id, "history": [] if not include_history else ["paid"]}

tool = FunctionTool.from_defaults(fn=lookup_customer)
agent = ReActAgent.from_tools([tool], verbose=True)
response = agent.chat("Find customer 123 and include history")
```

Fixed:

```python
from typing import Annotated

from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent

def lookup_customer(
    customer_id: Annotated[str, "Customer ID"],
    include_history: Annotated[bool, "Include account history"] = False,
):
    return {"id": customer_id, "history": [] if not include_history else ["paid"]}

tool = FunctionTool.from_defaults(fn=lookup_customer)
agent = ReActAgent.from_tools([tool], verbose=True)
response = agent.chat("Find customer 123 and include history")
```
Why this breaks:
- LLMs are much better at filling named parameters than guessing ambiguous signatures.
- `Annotated[...]` gives LlamaIndex a cleaner schema.
- If your function expects `int` but the model sends `"123"` as a string, Pydantic validation can fail depending on config.
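You can reproduce this failure mode without any LLM in the loop. The sketch below (using a hypothetical `lookup_customer` mirroring the tool above, no LlamaIndex required) parses the JSON arguments a model would emit and calls the function with them; a key the signature never declared raises the same kind of error that surfaces as a tool execution failure:

```python
import json

# Hypothetical tool function; the schema LlamaIndex derives comes from this
# signature, so parameter names must match what the model emits.
def lookup_customer(customer_id: str, include_history: bool = False):
    return {"id": customer_id, "history": ["paid"] if include_history else []}

# Tool arguments always arrive as JSON text, so a numeric ID may show up as a string.
raw_args = '{"customer_id": "123", "include_history": false}'
result = lookup_customer(**json.loads(raw_args))
print(result)  # {'id': '123', 'history': []}

# A key the signature never declared is exactly the kind of mismatch
# that surfaces as a tool execution failure:
try:
    lookup_customer(**{"customerId": "123"})  # camelCase key the schema never declared
except TypeError as exc:
    failure = f"tool call failed: {exc}"
```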
A more realistic broken case is returning or accepting unsupported shapes:
```python
def create_claim(payload):
    # payload is an untyped dict with nested objects and datetime instances
    return process_claim(payload)
```
Fix it by making the schema explicit:
```python
from pydantic import BaseModel

class ClaimRequest(BaseModel):
    policy_id: str
    loss_amount: float

def create_claim(payload: ClaimRequest):
    return process_claim(payload.model_dump())
```
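If Pydantic is not available in your test environment, the same idea can be sketched with standard-library dataclasses (`process_claim` here is a hypothetical stand-in for whatever your real handler does):

```python
from dataclasses import dataclass, asdict

@dataclass
class ClaimRequest:
    policy_id: str
    loss_amount: float

# Hypothetical downstream handler, standing in for the real process_claim
def process_claim(data: dict) -> dict:
    return {"status": "created", **data}

def create_claim(payload: ClaimRequest) -> dict:
    # asdict() keeps the value JSON-friendly for the agent transcript
    return process_claim(asdict(payload))

claim = create_claim(ClaimRequest(policy_id="POL-9", loss_amount=1200.0))
print(claim)  # {'status': 'created', 'policy_id': 'POL-9', 'loss_amount': 1200.0}
```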
Other Possible Causes
1. Your model does not support tool calling properly
Some models can chat but do not reliably emit structured tool calls. In LlamaIndex this often shows up when using a non-function-calling model with an agent that expects one.
```python
from llama_index.llms.openai import OpenAI

# Problematic if the model doesn't support tools well
llm = OpenAI(model="gpt-3.5-turbo")  # behavior varies by provider/version
```
Use a model known to support tool calls in your stack:
```python
llm = OpenAI(model="gpt-4o-mini")
```
If you're using another provider, verify that LlamaIndex's wrapper supports `tool_calls` for that exact backend.
2. The prompt causes the model to skip structured output
If your system prompt encourages free-form answers instead of tool use, the agent may never emit a valid call.
```python
system_prompt = "Answer naturally and do not use tools unless absolutely necessary."
```
Prefer clear tool instructions:
```python
system_prompt = """
You must use the available tools when the user requests a data lookup or action execution.
Return only valid tool calls when appropriate.
"""
```
With ReAct-style agents, overly creative prompts can produce malformed reasoning traces too.
3. Your tool returns non-serializable data
LlamaIndex may fail after a successful call if the return value cannot be serialized cleanly into the agent transcript.
```python
def get_policy(policy_id: str):
    return PolicyObject(...)  # custom class instance
```
Return JSON-friendly data:
```python
def get_policy(policy_id: str):
    policy = PolicyObject(...)
    return {
        "policy_id": policy.policy_id,
        "status": policy.status,
        "premium": policy.premium,
    }
```
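One way to catch this class of bug early is a small guard that every tool return value passes through during development. This is a minimal sketch (the `ensure_serializable` helper is hypothetical, not a LlamaIndex API):

```python
import json
from datetime import datetime

def ensure_serializable(value):
    """Fail fast during development if a tool's return value cannot be
    serialized into the agent transcript."""
    try:
        json.dumps(value)
    except TypeError as exc:
        raise TypeError(f"tool output is not JSON-serializable: {exc}") from exc
    return value

# A plain dict of JSON-friendly values passes through untouched
good = ensure_serializable({"policy_id": "P-1", "premium": 120.5})

# A datetime (or any custom object) is rejected immediately, at the tool
# boundary, instead of deep inside the agent loop
try:
    ensure_serializable({"created_at": datetime(2024, 1, 1)})
except TypeError as exc:
    message = str(exc)
```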
4. Version mismatch between LlamaIndex and provider SDKs
This one bites teams hard during development. You upgrade `llama-index` but keep an older OpenAI SDK, or vice versa.
Check for mismatched versions:
```bash
pip show llama-index openai pydantic
```
A safe fix is to align them deliberately:
```bash
pip install -U llama-index openai pydantic
```
If you pin versions in production, pin them together in lockstep and test tool flows after every bump.
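One way to pin in lockstep is a pip constraints file captured from an environment where tool flows are known to work (file names here are illustrative):

```bash
# Capture the exact versions that work together...
pip freeze | grep -E '^(llama-index|openai|pydantic)==' > constraints.txt

# ...and install against them elsewhere so the trio moves together
pip install -r requirements.txt -c constraints.txt
```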
How to Debug It
- Turn on verbose agent logging:

```python
agent = ReActAgent.from_tools([tool], verbose=True)
```

Look for whether the model emitted a tool name and arguments before failing.

- Inspect the raw schema generated for your tool:

```python
print(tool.metadata)
```

Check parameter names, defaults, and types. If you see vague `Any` types or missing descriptions, tighten the signature.

- Call the function directly with fake model-shaped input:

```python
lookup_customer(customer_id="123", include_history="true")
```

If this fails locally, your schema is too strict or your types are wrong.

- Swap in a known-good model. Test with a provider/model that has reliable function calling:

```python
llm = OpenAI(model="gpt-4o-mini")
```

If the error disappears, your original model/provider wrapper is the issue.
Prevention
- Use explicit typed signatures for every tool.
- Prefer `str`, `int`, `bool`, `list[str]`, or Pydantic models over loose `dict`/`Any`.
- Keep tool outputs JSON-serializable: return dicts, lists, strings, numbers, and booleans.
- Pin compatible versions of `llama-index`, provider SDKs like `openai`, and `pydantic`.
- Add one integration test per critical agent path: mock a real user request and verify that tool invocation succeeds end-to-end before shipping.
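A minimal, LLM-free sketch of such a test: drive the tool function with the JSON-shaped arguments a model would emit and assert on the result (`lookup_customer` is the hypothetical tool from the earlier examples):

```python
import json

# Hypothetical tool under test, same shape as the earlier examples
def lookup_customer(customer_id: str, include_history: bool = False):
    return {"id": customer_id, "history": ["paid"] if include_history else []}

def test_lookup_customer_tool_path():
    # Arguments exactly as a function-calling model would emit them: JSON text
    model_args = json.loads('{"customer_id": "123", "include_history": true}')
    result = lookup_customer(**model_args)
    assert result["id"] == "123"
    assert result["history"] == ["paid"]
    json.dumps(result)  # the output must stay transcript-serializable

test_lookup_customer_tool_path()
```

In a real suite this would live in pytest and also exercise the agent wiring, but even this bare version catches renamed parameters and non-serializable outputs before they reach a model.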
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.