How to Fix 'tool calling failure when scaling' in LlamaIndex (TypeScript)

By Cyprian Aarons · Updated 2026-04-21
Tags: tool-calling-failure-when-scaling, llamaindex, typescript

When LlamaIndex throws tool calling failure when scaling, it usually means the model returned a tool-call payload that your runtime could not execute or serialize correctly under load. In TypeScript, I usually see this when people scale from one-off prompts to multi-step agents, then hit schema mismatches, missing tool metadata, or concurrency bugs.

The important part: this is rarely “the model is broken.” It’s usually your tool definitions, message format, or async handling.

The Most Common Cause

The #1 cause is a mismatch between the tool schema you registered and the arguments the model actually sent. In LlamaIndex TS, that shows up as failures like:

  • Error: tool calling failure when scaling
  • ToolNotFoundError: Tool "searchDocs" not found
  • ValidationError: Invalid arguments for tool call
  • TypeError: Cannot read properties of undefined (reading 'name')

Here’s the broken pattern I see most often: the function works locally, but the tool metadata doesn’t match what the agent expects.

Broken vs. fixed, side by side:

  • Tool function exists, but name/params are inconsistent → Tool definition matches schema exactly
  • Uses ad hoc object shape → Uses explicit JSON schema / typed args
  • Returns non-serializable values → Returns plain JSON-safe data
// BROKEN
import { FunctionTool } from "llamaindex";

const searchDocs = async ({ query }: { query: string }) => {
  return {
    results: new Map([["a", 1]]), // not JSON-safe
  };
};

const tools = [
  FunctionTool.from(searchDocs, {
    name: "search_documents", // model may call "searchDocs"
    description: "Search internal docs",
  }),
];

// Later: agent asks for searchDocs(query="...")
// Runtime fails with tool resolution / serialization issues
// FIXED
import { FunctionTool } from "llamaindex";

type SearchArgs = {
  query: string;
};

const searchDocs = async ({ query }: SearchArgs) => {
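  // internalSearch stands in for your own search implementation; it is not a LlamaIndex API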
  const results = await internalSearch(query);

  return {
    results: results.map((r) => ({
      id: r.id,
      title: r.title,
      score: r.score,
    })),
  };
};

const tools = [
  FunctionTool.from(searchDocs, {
    name: "searchDocs",
    description: "Search internal docs",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string" },
      },
      required: ["query"],
      additionalProperties: false,
    },
  }),
];

The fix is boring but effective:

  • Keep the tool name stable
  • Define explicit parameters
  • Return plain JSON objects only
  • Don’t let the tool output depend on class instances, Maps, Dates, or circular refs

If you’re using OpenAI-style function calling through LlamaIndex, also make sure the model supports tool calls reliably. Some smaller models will emit malformed arguments under higher load.
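
One concrete guard is pinning the model explicitly instead of relying on a default. A minimal sketch, assuming an OpenAI-backed setup; depending on your LlamaIndex version the OpenAI class ships either from "llamaindex" itself or from "@llamaindex/openai", and the model name below is only an example.

// Sketch: pin a model known to handle tool calls reliably instead of a default.
import { OpenAI, Settings } from "llamaindex";

Settings.llm = new OpenAI({
  model: "gpt-4o-mini", // example model name; use one your provider supports
  temperature: 0,
});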

Other Possible Causes

1) You’re passing messages in the wrong shape

LlamaIndex expects chat parameters in a specific shape: one message, plus an optional chatHistory array of role/content objects. If you pass a bare array of strings and ad hoc message objects instead, tool execution can fail once the agent tries to route a call.

// BROKEN
await agent.chat([
  "Find policies for travel insurance",
  { role: "assistant", content: "Sure" },
]);
// FIXED
import type { ChatMessage } from "llamaindex";

const chatHistory: ChatMessage[] = [
  { role: "assistant", content: "Sure" },
];

await agent.chat({
  message: "Find policies for travel insurance",
  chatHistory,
});

2) Your tool returns non-serializable data

This gets worse when scaling because one request might succeed while another crashes during JSON encoding.

// BROKEN
return {
  now: new Date(),
  client: someClassInstance,
};
// FIXED
return {
  now: new Date().toISOString(),
  clientId: someClassInstance.id,
};

3) Your async tool throws intermittently under concurrency

If you fan out requests and your downstream API rate-limits you, LlamaIndex may surface it as a generic tool-calling failure.

const fetchPolicy = async ({ policyId }: { policyId: string }) => {
  const res = await fetch(`https://api.internal/policies/${policyId}`);
  if (!res.ok) throw new Error(`Policy API failed with ${res.status}`);
  return await res.json();
};

Fix it by wrapping errors and returning structured failures:

const fetchPolicy = async ({ policyId }: { policyId: string }) => {
  try {
    const res = await fetch(`https://api.internal/policies/${policyId}`);
    if (!res.ok) {
      return { ok: false, error: `Policy API failed with ${res.status}` };
    }
    return { ok: true, data: await res.json() };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
};
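
If the underlying issue is that you fan out too many requests at once, it also helps to cap concurrency at the tool boundary. Here is a minimal sketch; limitConcurrency and fetchPolicyLimited are helpers written for illustration (the same idea as the p-limit package), not LlamaIndex APIs.

// Sketch: cap how many downstream calls run at once so a burst of agent
// requests doesn't trip the policy API's rate limit.
const limitConcurrency = (max: number) => {
  let active = 0;
  const waiters: Array<() => void> = [];

  return async <T>(task: () => Promise<T>): Promise<T> => {
    if (active >= max) {
      // Wait for a slot; a finishing call hands its slot to us directly.
      await new Promise<void>((resolve) => waiters.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiters.shift();
      if (next) next(); // pass the slot on without freeing it
      else active--;
    }
  };
};

const limited = limitConcurrency(4);

const fetchPolicyLimited = async ({ policyId }: { policyId: string }) =>
  limited(() => fetchPolicy({ policyId }));

Register fetchPolicyLimited as the tool instead of fetchPolicy; the agent loop stays the same, only the fan-out is throttled.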

4) You exceeded context or token limits while scaling

When prompts get large, the model may truncate tool-call arguments or skip required fields. That can look like a random failure even though the root cause is prompt bloat.

// BAD IDEA
const longContext = documents.map((d) => d.text).join("\n\n");

Prefer retrieval plus tight context windows:

import { MetadataMode } from "llamaindex";

const topChunks = await retriever.retrieve({ query: "travel insurance exclusions" });
const context = topChunks.slice(0, 5).map((c) => c.node.getContent(MetadataMode.NONE)).join("\n\n");

How to Debug It

  1. Log the exact raw tool call

    • Capture the assistant message before execution (a logging sketch follows this list).
    • Look for missing name, malformed arguments, or wrong casing.
    • In practice you want to inspect something like:
      • tool_calls[0].function.name
      • tool_calls[0].function.arguments
  2. Validate your tool schema against runtime input

    • Check whether the model is sending { query } while your function expects { q }.
    • Make sure required fields are marked required.
    • Set additionalProperties: false during debugging so bad inputs fail fast (a runtime validation sketch also follows this list).
  3. Remove concurrency

    • Run one request at a time.
    • If it works serially but fails under load, you likely have shared mutable state in your tools.
    • Watch for global caches, reused clients with bad session state, or non-thread-safe wrappers.
  4. Return only primitives first

    • Temporarily simplify every tool response to strings and numbers.
    • If the failure disappears, your issue is serialization.
    • Add fields back one by one until it breaks again.
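
For step 1, if you cannot easily capture the raw assistant message, log at the tool boundary instead. A minimal sketch; withToolLogging is a helper written for this post (not a LlamaIndex API), and searchDocs is the tool defined earlier.

// Sketch: wrap a tool so every invocation logs the raw arguments it received
// and the value it returned.
const withToolLogging =
  <A, R>(name: string, fn: (args: A) => Promise<R>) =>
  async (args: A): Promise<R> => {
    console.log(`[tool:${name}] arguments:`, JSON.stringify(args));
    try {
      const result = await fn(args);
      console.log(`[tool:${name}] result:`, JSON.stringify(result));
      return result;
    } catch (err) {
      console.error(`[tool:${name}] threw:`, err);
      throw err;
    }
  };

// Register the wrapped function with FunctionTool.from(...) exactly as before.
const loggedSearchDocs = withToolLogging("searchDocs", searchDocs);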
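
For step 2, validate the model's arguments at runtime before the tool body runs. A sketch using zod as the validator (an assumption: zod is not required by LlamaIndex, so swap in whatever you already use); internalSearch is the same placeholder as earlier.

// Sketch: mirror the registered JSON schema as a runtime validator so bad
// arguments fail fast with a readable reason.
import { z } from "zod";

const searchDocsArgs = z.object({ query: z.string() }).strict(); // .strict() ~ additionalProperties: false

const searchDocsChecked = async (rawArgs: unknown) => {
  const parsed = searchDocsArgs.safeParse(rawArgs);
  if (!parsed.success) {
    // Structured failure: the agent sees why the call was rejected.
    return { ok: false, error: parsed.error.message };
  }
  return { ok: true, data: await internalSearch(parsed.data.query) };
};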

Prevention

  • Define every tool with an explicit schema and keep its name stable across deployments.
  • Make tool outputs JSON-safe by default; convert Dates, Maps, Sets, and class instances before returning (a conversion sketch follows this list).
  • Add an integration test that runs one full agent loop with real tool calls before shipping changes.
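
For the JSON-safe bullet, a small conversion helper keeps this from depending on discipline alone. A sketch; toJsonSafe is a hypothetical utility, not part of LlamaIndex.

// Sketch: recursively convert common non-JSON values before a tool returns them.
// Circular references still need separate handling.
const toJsonSafe = (value: unknown): unknown => {
  if (value instanceof Date) return value.toISOString();
  if (value instanceof Map)
    return Object.fromEntries([...value.entries()].map(([k, v]) => [String(k), toJsonSafe(v)]));
  if (value instanceof Set) return [...value].map(toJsonSafe);
  if (Array.isArray(value)) return value.map(toJsonSafe);
  if (typeof value === "object" && value !== null)
    return Object.fromEntries(Object.entries(value).map(([k, v]) => [k, toJsonSafe(v)]));
  return value; // strings, numbers, booleans, null, undefined pass through
};

// Usage inside a tool:
// return toJsonSafe({ now: new Date(), results: new Map([["a", 1]]) });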

If you’re seeing tool calling failure when scaling in LlamaIndex TypeScript, check for schema mismatches and serialization issues first. That’s where this error comes from most of the time.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

