How to Fix 'tool calling failure' in AutoGen (Python)

By Cyprian Aarons · Updated 2026-04-21

When AutoGen throws tool calling failure, it usually means the model tried to call a function/tool, but the request/response shape did not match what AutoGen expected. In practice, this shows up when you wire up AssistantAgent, UserProxyAgent, or a custom tool and the model returns malformed tool arguments, the tool schema is wrong, or the LLM backend does not support tool calls the way AutoGen expects.

The error often appears as one of these variants:

  • autogen.oai.client.OpenAIWrapperException: tool calling failure
  • ValueError: Failed to parse tool call
  • BadRequestError: Invalid tool call arguments
  • TypeError inside your registered function because AutoGen passed unexpected input

The Most Common Cause

The #1 cause is a mismatch between the tool signature and what the model is actually sending.

AutoGen uses your Python function signature to build the tool schema. If your function expects separate positional arguments, but the model sends a single JSON object, or if your function returns something non-serializable, you get a failure.

Broken vs fixed pattern

Broken → Fixed

  • Function expects loose positional args → Function accepts structured, typed parameters
  • Tool registration is incomplete → Tool is explicitly registered with a clear schema
  • Return value is not JSON-safe → Return value is plain text / JSON-serializable
# BROKEN
from autogen import AssistantAgent, UserProxyAgent

def lookup_policy_number(customer_id, region):
    return {"policy": 12345}  # may still fail depending on setup/serialization

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)

user = UserProxyAgent(name="user")

# Model may emit one JSON object, but function expects 2 positional args.
assistant.register_for_llm(name="lookup_policy_number", description="Lookup policy")(lookup_policy_number)
user.register_for_execution(name="lookup_policy_number")(lookup_policy_number)

# FIXED
from autogen import AssistantAgent, UserProxyAgent

def lookup_policy_number(customer_id: str, region: str) -> str:
    return f"policy_for_{customer_id}_{region}"

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_KEY"}]},
)

user = UserProxyAgent(name="user")

assistant.register_for_llm(
    name="lookup_policy_number",
    description="Lookup policy by customer_id and region",
)(lookup_policy_number)

user.register_for_execution(name="lookup_policy_number")(lookup_policy_number)

If you are using newer AutoGen tool APIs, prefer a typed schema style:

from pydantic import BaseModel
from autogen import AssistantAgent

class LookupPolicyArgs(BaseModel):
    customer_id: str
    region: str

def lookup_policy(args: LookupPolicyArgs) -> str:
    return f"policy_for_{args.customer_id}_{args.region}"

That removes ambiguity and makes malformed calls much easier to catch.
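The payoff shows up when the model emits malformed arguments: validation fails fast with a field-level error instead of a `TypeError` deep inside your function. A minimal sketch (`parse_tool_args` is an illustrative helper, not an AutoGen API):

```python
import json
from typing import Optional

from pydantic import BaseModel, ValidationError


class LookupPolicyArgs(BaseModel):
    customer_id: str
    region: str


def parse_tool_args(raw: str) -> Optional[LookupPolicyArgs]:
    """Validate the model's raw JSON arguments before executing the tool."""
    try:
        return LookupPolicyArgs(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError) as exc:
        # Surfaces exactly which field is missing or mistyped.
        print(f"Malformed tool call: {exc}")
        return None
```

A call like `parse_tool_args('{"customer_id": "cust_123"}')` now returns `None` and logs that `region` is missing, instead of blowing up mid-execution.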

Other Possible Causes

1. Tool not exposed to the agent that needs it

A common mistake is registering the function for execution only, but not for LLM planning, or vice versa.

# Wrong: execution only
user.register_for_execution(name="get_balance")(get_balance)

# Right: expose to LLM and execution path
assistant.register_for_llm(name="get_balance", description="Get account balance")(get_balance)
user.register_for_execution(name="get_balance")(get_balance)

If the assistant cannot “see” the tool schema, it will either hallucinate a call or emit a malformed one.

2. Model/backend does not support tool calling properly

Some models expose partial OpenAI compatibility but break on function/tool calls.

Typical symptoms:

  • tool calling failure
  • invalid_request_error
  • empty tool_calls payloads
  • assistant message contains text instead of structured tool call data

Check your config:

llm_config = {
    "config_list": [
        {
            "model": "some-openai-compatible-model",
            "base_url": "http://localhost:8000/v1",
            "api_key": "dummy"
        }
    ]
}

If that backend does not fully support OpenAI-style tools, switch models or disable tools for that agent.
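One way to check is a direct probe against the endpoint, outside AutoGen entirely. This stdlib-only sketch POSTs a chat completion with one dummy tool and reports whether a structured `tool_calls` array comes back (the `ping` tool and both helper names are made up for illustration):

```python
import json
import urllib.request

# A deliberately trivial tool schema: if the backend can't call this, it
# can't call anything.
PING_TOOL = {
    "type": "function",
    "function": {
        "name": "ping",
        "description": "No-op health-check tool",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}


def build_probe_payload(model: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": "Call the ping tool now."}],
        "tools": [PING_TOOL],
        "tool_choice": "auto",
    }


def probe_tool_support(base_url: str, model: str, api_key: str = "dummy"):
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(build_probe_payload(model)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Real tool support returns a structured tool_calls list here.
    return body["choices"][0]["message"].get("tool_calls")
```

If `probe_tool_support("http://localhost:8000/v1", "some-openai-compatible-model")` returns `None` or an empty list for a prompt that plainly demands a tool call, the backend is not emitting OpenAI-style tool calls, and no amount of AutoGen configuration will fix that.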

3. Non-serializable return values from your tool

AutoGen needs to pass results back into chat history. Returning objects like database cursors, datetime objects, or custom classes can break serialization.

# Bad
def get_claim(claim_id: str):
    return ClaimObject(claim_id=claim_id)  # custom class instance

# Good
def get_claim(claim_id: str):
    return {
        "claim_id": claim_id,
        "status": "open",
        "updated_at": "2026-04-21T10:00:00Z",
    }

Keep tool outputs as strings, dicts, lists, numbers, booleans, or JSON-safe structures.
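A cheap guard is to round-trip every tool result through `json` before returning it, so anything non-serializable fails in your own code with a clear traceback instead of inside AutoGen's message handling (`json_safe` is an illustrative helper, not an AutoGen API):

```python
import json


def json_safe(value):
    """Round-trip through JSON so non-serializable returns fail early."""
    return json.loads(json.dumps(value))


def get_claim(claim_id: str) -> dict:
    # Passing the result through json_safe proves it survives serialization.
    return json_safe({
        "claim_id": claim_id,
        "status": "open",
        "updated_at": "2026-04-21T10:00:00Z",
    })
```

If someone later adds a raw `datetime` or a custom object to that dict, `json_safe` raises `TypeError` at the tool boundary, which is far easier to debug than a failure mid-conversation.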

4. Function signature uses unsupported parameters

AutoGen can struggle with signatures that include *args, **kwargs, complex nested objects without schema hints, or defaults that don’t map cleanly.

# Risky
def search_claims(*args, **kwargs):
    ...

# Better
def search_claims(query: str, limit: int = 10) -> list[dict]:
    ...

If you need richer input, wrap it in a Pydantic model instead of relying on implicit parsing.
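For flat signatures, `typing.Annotated` lets you attach per-parameter descriptions that schema generators can read straight off the signature, a style recent AutoGen documentation also shows for typed tools. A sketch (the `search_claims` body is hypothetical):

```python
import json
from typing import Annotated, get_type_hints


def search_claims(
    query: Annotated[str, "Free-text search query"],
    limit: Annotated[int, "Maximum number of results"] = 10,
) -> str:
    # Hypothetical lookup; a real tool would query a claims store.
    results = [{"claim_id": f"claim_{i}", "query": query} for i in range(limit)]
    return json.dumps(results)


# The descriptions live in the signature itself, which is exactly what a
# schema generator inspects:
hints = get_type_hints(search_claims, include_extras=True)
```

Here `hints["query"].__metadata__` yields `("Free-text search query",)`, so the tool schema the model sees carries real parameter documentation rather than bare types.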

How to Debug It

  1. Inspect the exact exception text

    • Look for whether it fails during parsing, execution, or response handling.
    • Failed to parse tool call points to malformed arguments.
    • TypeError in your function points to signature mismatch.
  2. Print the raw assistant message

    • Check whether AutoGen received a proper structured tool call.
    • If you only see plain text like “I’ll use the tool now,” your backend may not be emitting real tool calls.
  3. Test the function outside AutoGen

    • Call it directly with sample inputs.
    • Verify types and return shape before involving the agent loop.
print(lookup_policy_number("cust_123", "us-east"))
  4. Reduce to one agent and one tool
    • Remove other tools.
    • Remove memory/chat history.
    • Use a single deterministic prompt:
      • “Call lookup_policy_number with customer_id=cust_123, region=us-east.”

That isolates whether the issue is schema generation, backend support, or your business logic.
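You can take the model out of the loop entirely by replaying the dispatch step yourself. This toy registry is not AutoGen internals, just the same name-lookup-then-`json.loads`-then-call sequence any executor performs, which makes it obvious whether the failure is in argument parsing or in your function:

```python
import json

TOOLS = {}


def register(fn):
    """Map a function's name to the function, like a tool registry."""
    TOOLS[fn.__name__] = fn
    return fn


@register
def lookup_policy_number(customer_id: str, region: str) -> str:
    return f"policy_for_{customer_id}_{region}"


def execute_tool_call(call: dict) -> str:
    fn = TOOLS[call["name"]]
    args = json.loads(call["arguments"])  # the parse step that fails on malformed JSON
    return fn(**args)  # the invoke step that fails on a signature mismatch
```

Feed it a well-formed call and it succeeds; drop `region` from the arguments and you get the same `TypeError` you would see inside the agent loop, now with a two-line reproduction.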

Prevention

  • Use typed tool inputs with Pydantic models or explicit type hints.
  • Return only JSON-safe values from tools.
  • Keep one clear registration path:
    • expose to LLM planning with register_for_llm
    • bind execution with register_for_execution
  • Validate model/backend support before shipping OpenAI-compatible endpoints into production.
  • Add unit tests for every registered tool:
    • valid input
    • missing field
    • wrong type
    • non-serializable output
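All four cases fit in a few lines of plain asserts. A sketch against a hypothetical `get_balance` tool (adapt the checks to your own tools):

```python
import json


def get_balance(account_id: str) -> str:
    # Hypothetical tool under test.
    if not isinstance(account_id, str):
        raise TypeError("account_id must be a string")
    return json.dumps({"account_id": account_id, "balance": 100.0})


def raises_type_error(fn, *args) -> bool:
    try:
        fn(*args)
    except TypeError:
        return True
    return False


# valid input
assert json.loads(get_balance("acct_1"))["balance"] == 100.0
# missing field
assert raises_type_error(get_balance)
# wrong type
assert raises_type_error(get_balance, 123)
# output is JSON-safe (json.loads raises if it is not valid JSON)
json.loads(get_balance("acct_1"))
```

Running these on every registered tool in CI catches signature and serialization regressions before the model ever sees the schema.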

By Cyprian Aarons, AI Consultant at Topiax.
