# How to Fix 'duplicate tool calls in production' in AutoGen (Python)

## What the error means
If you’re seeing duplicate tool calls in production, AutoGen is telling you the same tool invocation is being emitted more than once for a single logical turn. In practice, this usually shows up when the agent retries, your message history is replayed incorrectly, or your app processes the same LLM response twice.
This is common in Python deployments where AutoGen is wrapped by FastAPI, Celery, background workers, or any queue that can re-run a task after a timeout.
## The Most Common Cause
The #1 cause is reusing the same assistant response or message history across multiple executions and then calling run() / initiate_chat() again without clearing state. In AutoGen, AssistantAgent and UserProxyAgent are stateful enough that bad orchestration can replay the same tool call.
Here’s the broken pattern:
| Broken | Fixed |
|---|---|
| Reuse the same conversation state across requests | Create a fresh chat context per request |
| Call the agent twice on the same input path | Make tool execution idempotent and dedupe by request ID |
```python
# BROKEN
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    name="assistant",
    llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "..."}]},
)
user_proxy = UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    code_execution_config=False,
)

# request 1
result1 = user_proxy.initiate_chat(
    assistant,
    message="Check policy status for claim 123",
)

# request 2 reuses the same agents and may replay prior tool messages
result2 = user_proxy.initiate_chat(
    assistant,
    message="Check policy status for claim 123",
)
```
```python
# FIXED
from autogen import AssistantAgent, UserProxyAgent

def build_agents():
    assistant = AssistantAgent(
        name="assistant",
        llm_config={"config_list": [{"model": "gpt-4o-mini", "api_key": "..."}]},
    )
    user_proxy = UserProxyAgent(
        name="user",
        human_input_mode="NEVER",
        code_execution_config=False,
    )
    return assistant, user_proxy

def handle_request(prompt: str):
    # fresh agents per request; no stale tool history
    assistant, user_proxy = build_agents()
    return user_proxy.initiate_chat(assistant, message=prompt)

handle_request("Check policy status for claim 123")
```
If you’re using GroupChat / GroupChatManager, the same rule applies: don’t reuse a manager with old messages unless you explicitly reset it.
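If you do keep long-lived agents, reset their state between logical conversations (current AutoGen `ConversableAgent`s expose a `reset()` method; check your version for the exact API). The toy class below is not AutoGen itself; it only illustrates why stale chat history replays a tool call:

```python
# Toy stand-in for a stateful agent (NOT AutoGen): it shows how chat
# history left in memory makes an old tool call reappear on the next run.
class StatefulAgent:
    def __init__(self):
        self.chat_messages = []  # analogous to an agent's stored messages

    def reset(self):
        # AutoGen's ConversableAgent.reset() similarly clears stored history
        self.chat_messages.clear()

    def run(self, prompt: str):
        self.chat_messages.append({"role": "user", "content": prompt})
        self.chat_messages.append({"role": "assistant", "tool_call": "check_policy"})
        # anything still in history counts as a pending tool call again
        return [m for m in self.chat_messages if "tool_call" in m]

agent = StatefulAgent()
assert len(agent.run("claim 123")) == 1  # first request: one tool call
assert len(agent.run("claim 123")) == 2  # reused without reset: stale call replayed

agent.reset()  # explicit reset between requests
assert len(agent.run("claim 123")) == 1  # clean again
```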
## Other Possible Causes

### 1) Your webhook or worker retries the same job
If your API endpoint times out after sending a tool call, your queue worker may retry and execute it again. This is especially common with Celery `acks_late=True`, HTTP retries, or load balancer timeouts.
```python
# example: rejecting duplicate processing caused by retries
@app.post("/chat")
def chat(req: ChatRequest):
    task_id = req.request_id  # must be stable across retries
    if already_processed(task_id):
        return {"status": "duplicate_ignored"}
    mark_processing(task_id)
    return run_autogen(req.prompt)
Use a stable request ID and store processed IDs in Redis or your DB.
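A minimal sketch of those helpers, using an in-memory set as a stand-in for Redis. In production, prefer an atomic claim such as `redis.set(key, 1, nx=True, ex=3600)` so two workers racing on the same ID can't both pass the check:

```python
# In-memory stand-in for Redis; swap for an atomic primitive in production,
# e.g. redis.set(f"req:{request_id}", 1, nx=True, ex=3600) and treat a
# falsy return as "this ID was already claimed".
_processed: set[str] = set()

def already_processed(request_id: str) -> bool:
    return request_id in _processed

def mark_processing(request_id: str) -> None:
    _processed.add(request_id)

def handle_chat(request_id: str, prompt: str) -> dict:
    if already_processed(request_id):
        return {"status": "duplicate_ignored"}
    mark_processing(request_id)
    # ... run_autogen(prompt) would go here ...
    return {"status": "processed", "prompt": prompt}

assert handle_chat("req-1", "claim 123")["status"] == "processed"
assert handle_chat("req-1", "claim 123")["status"] == "duplicate_ignored"  # retry absorbed
```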
### 2) The model emitted repeated tool calls in one response
Some models will generate multiple identical function calls if your prompt is loose or if the tool schema doesn’t constrain arguments well. In AutoGen this often surfaces as repeated `FunctionCall` / `ToolCall` entries in the assistant message.
```python
llm_config = {
    "config_list": [{"model": "gpt-4o-mini", "api_key": "..."}],
    "temperature": 0,
}
```
Lowering temperature helps a bit. Better fix: tighten the prompt and make the tool contract explicit.
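As a defensive backstop, you can also drop repeats inside a single response before executing anything. A sketch assuming OpenAI-style tool-call entries with JSON-string arguments (adapt the field names to what your AutoGen version passes through):

```python
import json

def dedupe_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Drop repeated (name, arguments) pairs from one assistant message.

    Assumes OpenAI-style entries:
    {"function": {"name": ..., "arguments": "<json string>"}}
    """
    seen = set()
    unique = []
    for call in tool_calls:
        fn = call["function"]
        # normalize JSON arguments so key order doesn't create false "new" calls
        args = json.dumps(json.loads(fn["arguments"]), sort_keys=True)
        key = (fn["name"], args)
        if key not in seen:
            seen.add(key)
            unique.append(call)
    return unique
```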
### 3) Your tool function is not idempotent
If your tool writes to a database, sends an email, or creates a case record, duplicate execution becomes visible immediately. AutoGen may have only one logical call from its perspective, but your side effect makes it look like duplicates.
```python
import hashlib

def create_claim_note(claim_id: str, note: str):
    # Use a stable content hash; Python's built-in hash() is salted per
    # process, so it is not safe for cross-request deduplication.
    note_hash = hashlib.sha256(note.encode()).hexdigest()
    existing = db.notes.find_one({"claim_id": claim_id, "note_hash": note_hash})
    if existing:
        return {"status": "duplicate_skipped"}
    # store the hash so the next attempt can actually find the duplicate
    db.notes.insert_one({"claim_id": claim_id, "note": note, "note_hash": note_hash})
    return {"status": "created"}
```
Make every external action idempotent using a unique business key.
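A race-free variant lets the datastore enforce the key atomically instead of checking then inserting. A sketch with SQLite's `INSERT OR IGNORE` and a unique index (table and column names are illustrative; Postgres's `ON CONFLICT DO NOTHING` plays the same role):

```python
import sqlite3

# Let the database enforce the business key atomically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes ("
    "claim_id TEXT, note_hash TEXT, note TEXT, "
    "UNIQUE (claim_id, note_hash))"
)

def create_claim_note(claim_id: str, note_hash: str, note: str) -> str:
    cur = conn.execute(
        "INSERT OR IGNORE INTO notes VALUES (?, ?, ?)",
        (claim_id, note_hash, note),
    )
    # rowcount is 0 when the unique index rejected a duplicate
    return "created" if cur.rowcount == 1 else "duplicate_skipped"

assert create_claim_note("123", "abc", "called customer") == "created"
assert create_claim_note("123", "abc", "called customer") == "duplicate_skipped"
```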
### 4) You are appending messages manually and duplicating assistant output
This happens when you push both the raw model response and AutoGen’s parsed tool message into history. Then on the next turn AutoGen sees two copies of the same assistant action.
```python
# BAD: don't append both raw and parsed versions
messages.append(raw_response)
messages.append(parsed_tool_message)
```
Only let one layer own conversation state. If AutoGen manages messages, don’t also mirror them in your own list unless you know exactly how they’ll be consumed.
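If you must keep your own message list, a defensive dedupe before handing history back to AutoGen can catch the double-append. A sketch assuming OpenAI-style messages where tool calls carry an `id` (adapt field names to your version):

```python
def dedupe_history(messages: list[dict]) -> list[dict]:
    """Collapse messages describing the same assistant action.

    Keyed on tool-call IDs when present (OpenAI-style `tool_calls` shape
    is an assumption), otherwise on (role, content).
    """
    seen = set()
    out = []
    for msg in messages:
        ids = tuple(tc.get("id") for tc in msg.get("tool_calls", []))
        key = (msg.get("role"), ids if ids else msg.get("content"))
        if key in seen:
            continue  # raw response and its parsed mirror collapse to one
        seen.add(key)
        out.append(msg)
    return out
```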
## How to Debug It

- **Log every message before it reaches AutoGen.** Print role, content, tool name, and call ID. Look for repeated `tool_calls` entries, repeated `FunctionCall`s, or duplicated assistant turns.
- **Check whether your request handler runs twice.** Add a request ID to logs. If you see two identical IDs producing two executions, this is an infrastructure retry problem, not an AutoGen bug.
- **Inspect agent state between turns.** If you reuse `AssistantAgent`, `UserProxyAgent`, or `GroupChatManager`, dump their stored messages. Any old tool call still present in memory can be replayed on the next run.
- **Make one tool temporarily a no-op.** Replace side-effecting tools with logging only. If duplicates disappear from downstream systems but still appear in logs, your issue is idempotency or retry handling.
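The logging step can be a small audit helper. The OpenAI-style `tool_calls` message shape is an assumption; adapt the field names to whatever your AutoGen version actually stores:

```python
from collections import Counter

def audit_tool_calls(messages: list[dict]) -> list[tuple]:
    """Log each tool call and return any (name, arguments) seen more than once."""
    counts: Counter = Counter()
    for i, msg in enumerate(messages):
        for tc in msg.get("tool_calls", []):
            name = tc["function"]["name"]
            args = tc["function"].get("arguments")
            print(f"[{i}] role={msg.get('role')} tool={name} id={tc.get('id')}")
            counts[(name, args)] += 1
    # anything counted twice is a duplicate worth investigating
    return [call for call, n in counts.items() if n > 1]
```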
## Prevention

- Create fresh agent instances per request unless you intentionally manage long-lived conversations.
- Add idempotency keys to every external side effect: DB writes, tickets, emails, claims updates.
- Store and dedupe by `(request_id, tool_name, arguments_hash)` before executing any tool.
- Keep prompts strict about when tools should run and what counts as a valid call.
- Treat every networked entry point as retryable until proven otherwise.
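The `(request_id, tool_name, arguments_hash)` guard can be as small as this sketch, with an in-memory set standing in for Redis or a DB table:

```python
import hashlib
import json

_executed: set[tuple] = set()  # swap for Redis or a DB table in production

def should_execute(request_id: str, tool_name: str, arguments: dict) -> bool:
    # Hash normalized arguments so key order can't create a false "new" call
    args_hash = hashlib.sha256(
        json.dumps(arguments, sort_keys=True).encode()
    ).hexdigest()
    key = (request_id, tool_name, args_hash)
    if key in _executed:
        return False  # already ran for this request: skip the side effect
    _executed.add(key)
    return True
```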
If you want one practical rule: AutoGen chat state should be ephemeral; business actions should be idempotent. That’s what stops duplicate tool calls from turning into duplicate production incidents.
## Keep learning

- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit: architecture templates, compliance checklists, and a 7-email deep-dive course.