# AutoGen Tutorial (Python): adding cost tracking for intermediate developers

*By Cyprian Aarons · Updated 2026-04-21*
This tutorial shows you how to add per-run cost tracking to an AutoGen Python setup, so you can see how much each agent conversation costs in tokens and dollars. You need this when you move past prototypes and want basic spend visibility for debugging, budgeting, or routing expensive tasks to cheaper models.
## What You'll Need
- Python 3.10+
- The `autogen-agentchat` and `autogen-ext` packages
- An OpenAI API key set as an environment variable
- A model that supports usage reporting, such as `gpt-4o-mini`
- Basic familiarity with AutoGen agents and the `run()` API
Install the packages:
```bash
pip install autogen-agentchat autogen-ext openai
```
Set your API key:
```bash
export OPENAI_API_KEY="your-key-here"
```
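If you'd rather fail fast at startup than hit an opaque auth error mid-run, a small check helps. This helper is a sketch of my own (the name `require_api_key` is not part of AutoGen); it accepts an optional mapping so it's easy to test:

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI key, raising a clear error if it is missing."""
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running.")
    return key
```

Call it once at the top of `main()` so misconfiguration surfaces before any agent work starts.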
## Step-by-Step
### 1. Start with a minimal AutoGen assistant agent

We'll use a single assistant agent first so the cost tracking is easy to validate before adding more agents or tools.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    # When api_key is omitted, the client reads OPENAI_API_KEY from the environment.
    model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        system_message="You are a concise Python assistant.",
    )
    result = await agent.run(task="Write one sentence about cost tracking in AI agents.")
    print(result.messages[-1].content)
    await model_client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
### 2. Capture usage from the run result

Each agent message in the run result carries a `models_usage` field with prompt and completion token counts. Reading it there is the cleanest way to compute costs without manually wrapping every prompt.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Prices in USD per 1M tokens; update this table when your provider changes pricing.
PRICING_PER_1M_TOKENS = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    pricing = PRICING_PER_1M_TOKENS[model]
    input_cost = (prompt_tokens / 1_000_000) * pricing["input"]
    output_cost = (completion_tokens / 1_000_000) * pricing["output"]
    return input_cost + output_cost


async def main() -> None:
    model_name = "gpt-4o-mini"
    model_client = OpenAIChatCompletionClient(model=model_name)
    agent = AssistantAgent(name="assistant", model_client=model_client)

    result = await agent.run(task="Explain token usage in one short paragraph.")

    usage = result.messages[-1].models_usage
    if usage is None:
        raise RuntimeError("No usage metadata on the final message.")
    total_cost = estimate_cost(
        model_name,
        prompt_tokens=usage.prompt_tokens,
        completion_tokens=usage.completion_tokens,
    )
    print(f"Prompt tokens: {usage.prompt_tokens}")
    print(f"Completion tokens: {usage.completion_tokens}")
    print(f"Estimated cost: ${total_cost:.6f}")
    await model_client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
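The snippet above reads usage from the final message only. In a longer run with tool calls, several agent messages can each carry usage, so you may prefer to sum across every message that reports it. A minimal sketch of that aggregation, using stand-in dataclasses where a real AutoGen run would supply message objects exposing a `models_usage` attribute:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int


@dataclass
class Message:
    # Stand-in for an AutoGen chat message; user messages carry no usage.
    models_usage: Optional[Usage] = None


def sum_usage(messages: Iterable[Message]) -> Usage:
    # Skip messages (e.g. the user's task) that have no usage metadata.
    total = Usage(0, 0)
    for m in messages:
        if m.models_usage is not None:
            total.prompt_tokens += m.models_usage.prompt_tokens
            total.completion_tokens += m.models_usage.completion_tokens
    return total


messages = [Message(), Message(Usage(120, 40)), Message(Usage(200, 80))]
total = sum_usage(messages)
print(total.prompt_tokens, total.completion_tokens)  # 320 120
```

The same loop works on `result.messages` directly, since only the attribute name matters.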
### 3. Wrap the tracking logic so every run reports the same metrics

In production, don't scatter cost math across your codebase. Put it behind one helper so you can update pricing tables or add logging later without touching agent logic.
```python
import asyncio
from dataclasses import dataclass

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class CostReport:
    prompt_tokens: int
    completion_tokens: int
    estimated_cost_usd: float


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    # gpt-4o-mini rates: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
    input_rate = 0.15 / 1_000_000
    output_rate = 0.60 / 1_000_000
    return (prompt_tokens * input_rate) + (completion_tokens * output_rate)


async def run_with_cost_tracking(agent: AssistantAgent, task: str) -> CostReport:
    result = await agent.run(task=task)
    usage = result.messages[-1].models_usage
    if usage is None:
        raise RuntimeError("No usage metadata on the final message.")
    return CostReport(
        prompt_tokens=usage.prompt_tokens,
        completion_tokens=usage.completion_tokens,
        estimated_cost_usd=estimate_cost_usd(usage.prompt_tokens, usage.completion_tokens),
    )


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(name="assistant", model_client=client)
    report = await run_with_cost_tracking(agent, "Summarize why token tracking matters.")
    print(report)
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
### 4. Add structured logging for real observability

Printing is fine for local testing, but production teams want JSON logs they can ship to CloudWatch, Datadog, or ELK.
```python
import asyncio
import json
import logging

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cost-tracker")


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(name="assistant", model_client=client)

    result = await agent.run(task="List three benefits of cost tracking.")
    usage = result.messages[-1].models_usage

    if usage is not None:
        # Emit one JSON object per run so log pipelines can parse it directly.
        logger.info(json.dumps({
            "event": "agent_run_cost",
            "model": "gpt-4o-mini",
            "prompt_tokens": usage.prompt_tokens,
            "completion_tokens": usage.completion_tokens,
            "estimated_cost_usd": (usage.prompt_tokens * 0.15
                                   + usage.completion_tokens * 0.60) / 1_000_000,
        }))
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
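Once each run logs one JSON object with its estimated cost, downstream aggregation is a few lines. A sketch that sums spend from newline-delimited JSON log lines; the field names (`event`, `estimated_cost_usd`) are the illustrative ones used in this tutorial, not a fixed AutoGen format:

```python
import json

# Example NDJSON log lines as a pipeline would receive them.
log_lines = [
    '{"event": "agent_run_cost", "model": "gpt-4o-mini", "estimated_cost_usd": 0.00042}',
    '{"event": "agent_run_cost", "model": "gpt-4o-mini", "estimated_cost_usd": 0.00108}',
    '{"event": "other", "detail": "ignored"}',
]

# Keep only cost events and sum their estimates.
total = sum(
    rec["estimated_cost_usd"]
    for rec in map(json.loads, log_lines)
    if rec.get("event") == "agent_run_cost"
)
print(f"${total:.5f}")  # $0.00150
```

The same filter-and-sum works as a CloudWatch Logs Insights or Datadog query once the events are shipped; doing it in Python first is a cheap way to validate the schema.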
### 5. Track cumulative spend across multiple runs

Once you have single-run numbers working, aggregate them by session, user, workflow step, or tenant so finance and platform teams can see where spend comes from.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


def estimate(prompt_tokens: int, completion_tokens: int) -> float:
    return (prompt_tokens * 0.15 / 1_000_000) + (completion_tokens * 0.60 / 1_000_000)


async def main() -> None:
    total_cost = 0.0
    client = OpenAIChatCompletionClient(model="gpt-4o-mini")
    agent = AssistantAgent(name="assistant", model_client=client)

    tasks = [
        "Define token-based pricing in one sentence.",
        "Name two ways to reduce LLM spend.",
    ]
    for task in tasks:
        result = await agent.run(task=task)
        usage = result.messages[-1].models_usage
        if usage is not None:
            total_cost += estimate(usage.prompt_tokens, usage.completion_tokens)

    print(f"Total estimated cost for {len(tasks)} runs: ${total_cost:.6f}")
    await client.close()


if __name__ == "__main__":
    asyncio.run(main())
```
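A single running total answers "what did this session cost," but breaking spend down by key (tenant, user, workflow step) is usually what finance teams ask for. A pure-Python sketch of a `defaultdict` ledger; the tenant names and token counts are illustrative:

```python
from collections import defaultdict


def estimate(prompt_tokens: int, completion_tokens: int) -> float:
    # gpt-4o-mini rates: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
    return (prompt_tokens * 0.15 + completion_tokens * 0.60) / 1_000_000


# Accumulated cost per tenant; missing keys start at 0.0.
ledger: dict = defaultdict(float)


def record(tenant: str, prompt_tokens: int, completion_tokens: int) -> None:
    ledger[tenant] += estimate(prompt_tokens, completion_tokens)


record("acme", 1_000, 500)
record("acme", 2_000, 1_000)
record("globex", 500, 250)

for tenant, cost in sorted(ledger.items()):
    print(f"{tenant}: ${cost:.6f}")
```

Call `record()` after each `agent.run()` with the usage from the result, keyed by whatever dimension you need; swapping the dict for a database table is the natural next step.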
---
## Keep learning
- [The complete AI Agents Roadmap](/blog/ai-agents-roadmap-2026) — my full 8-step breakdown
- [Free: The AI Agent Starter Kit](/starter-kit) — PDF checklist + starter code
- [Work with me](/contact) — I build AI for banks and insurance companies
*By Cyprian Aarons, AI Consultant at [Topiax](https://topiax.xyz).*