AutoGen Tutorial (Python): building custom tools for advanced developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to build custom tools in AutoGen with Python, wire them into an assistant agent, and execute them safely from a user-facing conversation. You need this when the built-in examples stop being enough and you want your agent to call internal logic, validate inputs, or interact with your own services.

What You'll Need

  • Python 3.10+
  • pyautogen installed
  • An OpenAI API key
  • Basic familiarity with AutoGen AssistantAgent and UserProxyAgent
  • A terminal and a working Python virtual environment

Install the package:

pip install pyautogen

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Start by defining a custom tool as a normal Python function. Keep it deterministic, typed, and easy to test because AutoGen will call it like any other function.
from typing import Annotated

def calculate_risk_score(
    income: Annotated[int, "Annual income in USD"],
    claims: Annotated[int, "Number of claims filed"],
) -> int:
    """Return a simple risk score for demonstration."""
    base = 100
    score = base - (income // 1000) + (claims * 15)
    return max(score, 0)
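It's worth calling the function directly once before handing it to an agent. With the formula above, income 85000 and 2 claims gives 100 - 85 + 30 = 45:

```python
from typing import Annotated

# Repeated from the step above so this snippet runs standalone.
def calculate_risk_score(
    income: Annotated[int, "Annual income in USD"],
    claims: Annotated[int, "Number of claims filed"],
) -> int:
    """Return a simple risk score for demonstration."""
    base = 100
    score = base - (income // 1000) + (claims * 15)
    return max(score, 0)

print(calculate_risk_score(85000, 2))   # 100 - 85 + 30 -> 45
print(calculate_risk_score(500000, 5))  # raw score is negative, clamps to 0
```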
  2. Register that function with an AssistantAgent so the model can decide when to call it. In pyautogen the cleanest way is register_for_llm, which reads the signature and Annotated descriptions and exposes the function to the model as a tool schema.
import os
from autogen import AssistantAgent

llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
}

assistant = AssistantAgent(
    name="insurance_assistant",
    llm_config=llm_config,
    system_message=(
        "You are an insurance operations assistant. "
        "Use tools when they help answer the user accurately."
    ),
)

# Let the model propose calls to the tool.
assistant.register_for_llm(
    description="Compute a simple applicant risk score."
)(calculate_risk_score)
  3. Create a user proxy that executes the tool calls the assistant proposes. For advanced setups, keep execution local and controlled so you can inspect behavior before connecting real systems, and cap auto-replies so a runaway loop cannot spin forever.
from autogen import UserProxyAgent

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    code_execution_config={
        "work_dir": "autogen_work",
        "use_docker": False,
    },
)

# Actually run the tool when the assistant calls it.
user_proxy.register_for_execution()(calculate_risk_score)
  4. Send a request that should trigger the tool. The assistant will decide whether to call your function based on the prompt and its system instructions.
message = (
    "Compute a risk score for an applicant with income 85000 "
    "and 2 claims filed. Explain the result."
)

user_proxy.initiate_chat(
    assistant,
    message=message,
)
  5. If you need stricter control, wrap your tool with validation before exposing it. This is the pattern I use when the agent touches business logic that must not accept garbage input.
from typing import Annotated

def calculate_risk_score_strict(
    income: Annotated[int, "Annual income in USD"],
    claims: Annotated[int, "Number of claims filed"],
) -> int:
    if income < 0:
        raise ValueError("income must be non-negative")
    if claims < 0:
        raise ValueError("claims must be non-negative")

    base = 100
    score = base - (income // 1000) + (claims * 15)
    return max(score, 0)
  6. Swap the stricter function into the agent and rerun the same conversation. Recreating the assistant drops the old registration, and register_function then wires the new tool to both agents in one call, which gives you a production-friendly path for adding guards without changing how the assistant is used.
from autogen import register_function

assistant = AssistantAgent(
    name="insurance_assistant",
    llm_config=llm_config,
    system_message=(
        "You are an insurance operations assistant. "
        "Use tools when they help answer the user accurately."
    ),
)

register_function(
    calculate_risk_score_strict,
    caller=assistant,
    executor=user_proxy,
    description="Compute a risk score with input validation.",
)

user_proxy.initiate_chat(
    assistant,
    message="Compute a risk score for income 85000 and 2 claims.",
)

Testing It

Run the script end to end and confirm that AutoGen prints a tool call or includes the calculated score in the assistant response. If the model answers without using the tool, tighten the system message so it prefers tool usage for numeric or policy-driven questions.

Test invalid inputs next by changing income or claims to negative values inside your prompt or by calling the function directly in Python. You want to see clear failures early instead of silent bad data flowing through an agent workflow.

For deeper verification, add unit tests around your custom function itself before involving AutoGen at all. That keeps your business logic stable even if you later swap models or change agent orchestration.
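A minimal test module for the strict function might look like this, using only the standard library's unittest (the function is repeated so the file stands alone; run it with python -m unittest):

```python
import unittest

# Same logic as calculate_risk_score_strict above, repeated for a standalone test file.
def calculate_risk_score_strict(income: int, claims: int) -> int:
    if income < 0:
        raise ValueError("income must be non-negative")
    if claims < 0:
        raise ValueError("claims must be non-negative")
    return max(100 - (income // 1000) + (claims * 15), 0)

class RiskScoreTests(unittest.TestCase):
    def test_typical_applicant(self):
        self.assertEqual(calculate_risk_score_strict(85000, 2), 45)

    def test_score_is_clamped_at_zero(self):
        self.assertEqual(calculate_risk_score_strict(500000, 0), 0)

    def test_negative_income_rejected(self):
        with self.assertRaises(ValueError):
            calculate_risk_score_strict(-1, 0)

    def test_negative_claims_rejected(self):
        with self.assertRaises(ValueError):
            calculate_risk_score_strict(85000, -1)
```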

Next Steps

  • Add multiple tools and let AutoGen choose between them for lookup, calculation, and formatting tasks.
  • Learn how to route tool calls through your own service layer instead of embedding business logic directly in Python functions.
  • Add structured outputs and schema validation so downstream systems can trust what your agent returns.
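As a taste of that last point, here is one sketch of returning a structured, self-validating result instead of a bare int. It uses a plain stdlib dataclass rather than any AutoGen API, and the band thresholds are arbitrary demo values:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class RiskResult:
    """Structured tool output that downstream systems can validate."""
    score: int
    band: str  # "low", "medium", or "high"

    def __post_init__(self):
        if self.score < 0:
            raise ValueError("score must be non-negative")
        if self.band not in {"low", "medium", "high"}:
            raise ValueError(f"unknown band: {self.band}")

def calculate_risk_result(income: int, claims: int) -> dict:
    score = max(100 - (income // 1000) + (claims * 15), 0)
    band = "low" if score < 30 else "medium" if score < 70 else "high"
    # Return a plain dict so the tool response serializes cleanly.
    return asdict(RiskResult(score=score, band=band))

print(calculate_risk_result(85000, 2))  # {'score': 45, 'band': 'medium'}
```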


By Cyprian Aarons, AI Consultant at Topiax.
