How to Fix 'tool calling failure' in LlamaIndex (Python)
When you see tool calling failure in LlamaIndex, it usually means the LLM was asked to call a tool but returned something the framework could not parse or execute. In practice, this shows up when using FunctionAgent, ReActAgent, or any agent wrapped around tools where the model output does not match the tool-calling contract.
The error is rarely “the tool is broken.” It’s usually one of three things: the model does not support tool calling, the tool schema is malformed, or your prompt/agent setup is forcing a response format the model can’t satisfy.
The Most Common Cause
The #1 cause is using an LLM that does not support structured tool calling, or configuring LlamaIndex to use it as if it does.
Typical symptoms:
- `ValueError: Tool calling failed`
- `NotImplementedError: Function calling not supported by this LLM`
- `OpenAI API error: model does not support tools`
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Uses a plain completion model for an agent that expects tools | Uses a tool-capable chat model |
| Passes raw text-only responses into a tool agent | Lets LlamaIndex receive structured tool calls |
| Mixes incompatible model + agent type | Matches model capabilities to agent type |
```python
# BROKEN
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo-instruct")  # completion-style, bad fit for tool calling

agent = ReActAgent.from_tools(
    tools=[my_tool],
    llm=llm,
)
response = agent.chat("Check policy status for policy 123")
```
```python
# FIXED
from llama_index.core.agent import FunctionAgent
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")  # chat model with tool calling support

agent = FunctionAgent.from_tools(
    tools=[my_tool],
    llm=llm,
)
response = agent.chat("Check policy status for policy 123")
```
If you are on older LlamaIndex versions, the exact class names may differ, but the rule stays the same: use a chat model that supports tools with an agent that expects tool calls.
Other Possible Causes
1) Your tool signature is invalid or not JSON-safe
LlamaIndex builds a schema from your Python function signature. If you use unsupported types, weird defaults, or missing type hints, the schema can break.
```python
# BROKEN
def lookup_claim(claim_id, meta={}):  # no type hints, mutable default
    return {"claim_id": claim_id}
```

```python
# FIXED
def lookup_claim(claim_id: str) -> dict:
    return {"claim_id": claim_id}
```
If you need richer input, define explicit fields with simple types.
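To see why missing hints break things, here is a minimal sketch of how a framework might derive a JSON-schema-style description from a function signature. This is not LlamaIndex's actual implementation, only an illustration of why a parameter without a type hint has nothing to map to:

```python
import inspect
from typing import get_type_hints

# Rough mapping from simple Python types to JSON schema types
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def build_tool_schema(fn):
    """Sketch of signature-to-schema conversion; real frameworks do far more."""
    hints = get_type_hints(fn)
    properties = {}
    for name in inspect.signature(fn).parameters:
        if name not in hints:
            raise TypeError(f"Parameter '{name}' of {fn.__name__} has no type hint")
        py_type = hints[name]
        if py_type not in _JSON_TYPES:
            raise TypeError(f"Unsupported parameter type for '{name}': {py_type}")
        properties[name] = {"type": _JSON_TYPES[py_type]}
    return {"name": fn.__name__,
            "parameters": {"type": "object", "properties": properties}}

def lookup_claim(claim_id: str) -> dict:
    return {"claim_id": claim_id}

schema = build_tool_schema(lookup_claim)
```

Running `build_tool_schema` on the broken `lookup_claim` above (no hints, `meta={}`) raises immediately, which is roughly what the framework's schema builder hits at agent setup time.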
2) You wrapped a non-tool-capable LLM in an agent
Some providers expose text generation but no native function/tool calling. LlamaIndex cannot invent tool support for them.
```python
# BROKEN
from llama_index.llms.huggingface import HuggingFaceInferenceAPI

llm = HuggingFaceInferenceAPI(model_name="some-text-only-model")
```
Use a provider/model that explicitly supports tools, or switch to a workflow that does not depend on native function calls.
3) The prompt is forcing free-form output
If your system prompt tells the model to “always answer in plain English” while the agent expects a tool call, you get conflicts. This often produces messages like:
- `Could not parse tool call from response`
- `tool calling failure`
- `No valid function call found`
```python
# BROKEN PROMPT IDEA
system_prompt = """
Answer only in prose.
Never emit JSON.
Never call functions.
"""
```

```python
# FIXED PROMPT IDEA
system_prompt = """
Use tools when needed.
Return normal assistant text only after completing any required tool calls.
"""
```
For agents, keep prompts aligned with the execution mode. Don’t fight the framework.
4) Tool names or descriptions are colliding
If two tools have similar names or vague descriptions, models sometimes choose incorrectly or emit malformed calls.
```python
# RISKY
from llama_index.core.tools import FunctionTool

def get_policy_data(policy_id: str): ...
def get_claim_data(claim_id: str): ...

# Two different tools registered under the same name
policy_tool = FunctionTool.from_defaults(fn=get_policy_data, name="get_data")
claim_tool = FunctionTool.from_defaults(fn=get_claim_data, name="get_data")
```
Make names unique and descriptions specific.
```python
# BETTER
def get_policy_status(policy_id: str) -> dict: ...
def get_claim_details(claim_id: str) -> dict: ...

# Names default to the function name; descriptions tell the model when to pick each one
policy_tool = FunctionTool.from_defaults(
    fn=get_policy_status, description="Current status of a policy by policy_id.")
claim_tool = FunctionTool.from_defaults(
    fn=get_claim_details, description="Details for a claim by claim_id.")
```
How to Debug It
1. Check whether your model supports tools
   - Confirm the exact provider/model in your config.
   - If you see `Function calling not supported`, stop debugging the agent and swap models first.
2. Print the generated tool schema
   - Look at your function annotations and defaults.
   - If you have nested custom objects, lists of unions, or missing type hints, simplify them.
3. Run one tool with one prompt
   - Remove all but one tool.
   - Use a direct prompt like: `"Call get_policy_status for policy_id=123"`.
4. Inspect raw responses before parsing
   - If possible, log the assistant message returned by the LLM.
   - You want to see whether it emitted:
     - a valid structured tool call,
     - plain text,
     - or malformed JSON-like output.
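That last inspection step can be automated with a small triage helper. The dict shape below (`tool_calls`, `content`) mirrors OpenAI-style chat messages, which is an assumption on my part; adjust the keys for your provider:

```python
import json

def classify_assistant_message(message: dict) -> str:
    """Rough triage of a raw assistant message into the three cases above."""
    if message.get("tool_calls"):  # provider returned structured tool calls
        return "tool_call"
    text = (message.get("content") or "").strip()
    if text.startswith("{") or text.startswith("["):
        # JSON-looking text emitted as prose, not as a native tool call
        try:
            json.loads(text)
            return "json_text"
        except json.JSONDecodeError:
            return "malformed_json"
    return "plain_text"
```

For example, `classify_assistant_message({"content": "{not json"})` returns `"malformed_json"`, which is the case that usually surfaces as a parse error inside the agent.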
A practical debug checklist:

- `llm.model` is actually tool-capable
- Tool function has clean type hints
- Agent class matches your use case:
  - `FunctionAgent` for native tool calling
  - `ReActAgent` when reasoning + text-based action loops are intended
- Prompt does not conflict with structured output
Prevention
- Use only models that explicitly support function/tool calling for production agents.
- Keep tools small and typed:
  - simple arguments like `str`, `int`, `bool`
  - no ambiguous dict blobs unless necessary
- Add startup validation:
  - verify model capability
  - verify every registered tool has type hints and unique names
If you’re building bank or insurance workflows, treat this as a contract problem, not an AI problem. The fix is usually making sure your agent, prompt, and model all agree on how a tool call should look before you ever hit runtime.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.