# AutoGen Tutorial (Python): adding cost tracking for intermediate developers

*By Cyprian Aarons · Updated 2026-04-21*
This tutorial shows you how to add per-run cost tracking to an AutoGen Python setup, so you can see how much each agent conversation costs in tokens and dollars. You need this when you move past prototypes and want basic spend visibility for debugging, budgeting, or routing expensive tasks to cheaper models.
## What You'll Need
- Python 3.10+
- The `autogen-agentchat` and `autogen-ext` packages
- An OpenAI API key set as an environment variable
- A model that supports usage reporting, such as `gpt-4o-mini`
- Basic familiarity with AutoGen agents and the `run()` API
Install the packages:
```bash
pip install autogen-agentchat autogen-ext openai
```
Set your API key:
```bash
export OPENAI_API_KEY="your-key-here"
```
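If you'd rather fail fast at startup than hit an opaque auth error mid-run, a small check helps. This helper is a sketch of my own (the name `require_api_key` is not part of AutoGen); it accepts an optional mapping so it's easy to test:

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI key, raising a clear error if it is missing."""
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running.")
    return key
```

Call it once at the top of `main()` so misconfiguration surfaces before any agent work starts.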
## Step-by-Step
### 1. Start with a minimal AutoGen assistant agent

We'll use a single assistant agent first so the cost tracking is easy to validate before adding more agents or tools.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    # When api_key is omitted, the client reads OPENAI_API_KEY from the environment.
    model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        system_message="You are a concise Python assistant.",
    )
    result = await agent.run(task="Write one sentence about cost tracking in AI agents.")
    print(result.messages[-1].content)
    await model_client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
### 2. Capture usage from the run result

Each agent message in the run result carries a `models_usage` field with prompt and completion token counts. Reading it there is the cleanest way to compute costs without manually wrapping every prompt.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Prices in USD per 1M tokens; update this table when your provider changes pricing.
PRICING_PER_1M_TOKENS = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    pricing = PRICING_PER_1M_TOKENS[model]
    input_cost = (prompt_tokens / 1_000_000) * pricing["input"]
    output_cost = (completion_tokens / 1_000_000) * pricing["output"]
    return input_cost + output_cost


async def main() -> None:
    model_name = "gpt-4o-mini"
    model_client = OpenAIChatCompletionClient(model=model_name)
    agent = AssistantAgent(name="assistant", model_client=model_client)

    result = await agent.run(task="Explain token usage in one short paragraph.")

    usage = result.messages[-1].models_usage
    if usage is None:
        raise RuntimeError("No usage metadata on the final message.")
    total_cost = estimate_cost(
        model_name,
        prompt_tokens=usage.prompt_tokens,
        completion_tokens=usage.completion_tokens,
    )
    print(f"Prompt tokens: {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Estimated cost: ${total_cost:.6f}")
    await model_client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
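The snippet above reads usage from the final message only. In a longer run with tool calls, several agent messages can each carry usage, so you may prefer to sum across every message that reports it. A minimal sketch of that aggregation, using stand-in dataclasses where a real AutoGen run would supply message objects exposing a `models_usage` attribute:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int


@dataclass
class Message:
    # Stand-in for an AutoGen chat message; user messages carry no usage.
    models_usage: Optional[Usage] = None


def sum_usage(messages: Iterable[Message]) -> Usage:
    # Skip messages (e.g. the user's task) that have no usage metadata.
    total = Usage(0, 0)
    for m in messages:
        if m.models_usage is not None:
            total.prompt_tokens += m.models_usage.prompt_tokens
            total.completion_tokens += m.models_usage.completion_tokens
    return total


messages = [Message(), Message(Usage(120, 40)), Message(Usage(200, 80))]
total = sum_usage(messages)
print(total.prompt_tokens, total.completion_tokens)  # 320 120
```

The same loop works on `result.messages` directly, since only the attribute name matters.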
### 3. Wrap the tracking logic so every run reports the same metrics

In production, don't scatter cost math across your codebase. Put it behind one helper so you can update pricing tables or add logging later without touching agent logic.
```python
import asyncio
from dataclasses import dataclass

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class CostReport:
    prompt_tokens: int
    completion_tokens: int
    estimated_cost_usd: float


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    # gpt-4o-mini rates: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
    input_rate = 0.15 / 1_000_000
    output_rate = 0.60 / 1_000_000
    return (prompt_tokens * input_rate) + (completion_tokens * output_rate)


async def run_with_cost_tracking(agent: AssistantAgent, task: str) -> CostReport:
    result = await agent.run(task=task)
    usage = result.messages[-1].models_usage
    if usage is None:
        raise RuntimeError("No usage metadata on the final message.")
    return CostReport(
        prompt_tokens=usage.prompt_tokens,
        completion_tokens=usage.completion_tokens,
        estimated_cost_usd=estimate_cost_usd(usage.prompt_tokens, usage.completion_tokens),
    )


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(name="assistant", model_client=client)
    report = await run_with_cost_tracking(agent, "Summarize why token tracking matters.")
    print(report)
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
### 4. Add structured logging for real observability

Printing is fine for local testing, but production teams want JSON logs they can ship to CloudWatch, Datadog, or ELK.
```python
import asyncio
import json
import logging

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cost-tracker")


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(name="assistant", model_client=client)

    result = await agent.run(task="List three benefits of cost tracking.")
    usage = result.messages[-1].models_usage

    if usage is not None:
        # Emit one JSON object per run so log pipelines can parse it directly.
        logger.info(json.dumps({
            "event": "agent_run_cost",
            "model": "gpt-4o-mini",
            "prompt_tokens": usage.prompt_tokens,
            "completion_tokens": usage.completion_tokens,
            "estimated_cost_usd": (usage.prompt_tokens * 0.15
                                   + usage.completion_tokens * 0.60) / 1_000_000,
        }))
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
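Once each run logs one JSON object with its estimated cost, downstream aggregation is a few lines. A sketch that sums spend from newline-delimited JSON log lines; the field names (`event`, `estimated_cost_usd`) are the illustrative ones used in this tutorial, not a fixed AutoGen format:

```python
import json

# Example NDJSON log lines as a pipeline would receive them.
log_lines = [
    '{"event": "agent_run_cost", "model": "gpt-4o-mini", "estimated_cost_usd": 0.00042}',
    '{"event": "agent_run_cost", "model": "gpt-4o-mini", "estimated_cost_usd": 0.00108}',
    '{"event": "other", "detail": "ignored"}',
]

# Keep only cost events and sum their estimates.
total = sum(
    rec["estimated_cost_usd"]
    for rec in map(json.loads, log_lines)
    if rec.get("event") == "agent_run_cost"
)
print(f"${total:.5f}")  # $0.00150
```

The same filter-and-sum works as a CloudWatch Logs Insights or Datadog query once the events are shipped; doing it in Python first is a cheap way to validate the schema.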
### 5. Track cumulative spend across multiple runs

Once you have single-run numbers working, aggregate them by session, user, workflow step, or tenant so finance and platform teams can see where spend comes from.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


def estimate(prompt_tokens: int, completion_tokens: int) -> float:
    return (prompt_tokens * 0.15 / 1_000_000) + (completion_tokens * 0.60 / 1_000_000)


async def main() -> None:
    total_cost = 0.0
    client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(name="assistant", model_client=client)

    tasks = [
        "Define token-based pricing in one sentence.",
        "Name two ways to reduce LLM spend.",
    ]
    for task in tasks:
        result = await agent.run(task=task)
        usage = result.messages[-1].models_usage
        if usage is not None:
            total_cost += estimate(usage.prompt_tokens, usage.completion_tokens)

    print(f"Total estimated cost for {len(tasks)} runs: ${total_cost:.6f}")
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
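A single running total answers "what did this session cost," but breaking spend down by key (tenant, user, workflow step) is usually what finance teams ask for. A pure-Python sketch of a `defaultdict` ledger; the tenant names and token counts are illustrative:

```python
from collections import defaultdict


def estimate(prompt_tokens: int, completion_tokens: int) -> float:
    # gpt-4o-mini rates: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
    return (prompt_tokens * 0.15 + completion_tokens * 0.60) / 1_000_000


# Accumulated cost per tenant; missing keys start at 0.0.
ledger: dict = defaultdict(float)


def record(tenant: str, prompt_tokens: int, completion_tokens: int) -> None:
    ledger[tenant] += estimate(prompt_tokens, completion_tokens)


record("acme", 1_000, 500)
record("acme", 2_000, 1_000)
record("globex", 500, 250)

for tenant, cost in sorted(ledger.items()):
    print(f"{tenant}: ${cost:.6f}")
```

Call `record()` after each `agent.run()` with the usage from the result, keyed by whatever dimension you need; swapping the dict for a database table is the natural next step.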
---
## Keep learning
- [The complete AI Agents Roadmap](/blog/ai-agents-roadmap-2026) — my full 8-step breakdown
- [Free: The AI Agent Starter Kit](/starter-kit) — PDF checklist + starter code
- [Work with me](/contact) — I build AI for banks and insurance companies
*By Cyprian Aarons, AI Consultant at [Topiax](https://topiax.xyz).*