AutoGen Tutorial (TypeScript): adding observability for intermediate developers
This tutorial shows you how to add practical observability to an AutoGen TypeScript agent setup: logs, spans, and message-level tracing you can inspect when something goes wrong. You need this when your agent starts making multi-step decisions and you can no longer debug it by staring at the final answer.
What You'll Need
- Node.js 18+ installed
- A TypeScript project with `ts-node` or a build step
- These packages:
  - `@autogen/core`
  - `@autogen/openai`
  - `dotenv`
  - `pino`
- An OpenAI API key in `OPENAI_API_KEY`
- Optional but useful:
  - An OpenTelemetry backend like Jaeger, Tempo, or Honeycomb
  - A local `.env` file for secrets
Step-by-Step
1. Start with a minimal AutoGen agent and a logger. The goal is to keep the agent code clean while emitting structured events around every request and response.
import "dotenv/config";
import pino from "pino";
import { AssistantAgent } from "@autogen/core";
import { OpenAIChatCompletionClient } from "@autogen/openai";
const logger = pino({ level: "info" });
const modelClient = new OpenAIChatCompletionClient({
model: "gpt-4o-mini",
apiKey: process.env.OPENAI_API_KEY!,
});
const agent = new AssistantAgent({
name: "support-agent",
modelClient,
});
2. Wrap each run with request metadata. In production, you want a correlation ID, user ID, and session ID attached to every log line so you can trace one conversation across services.
```ts
import crypto from "node:crypto"; // for crypto.randomUUID()

const requestContext = {
  requestId: crypto.randomUUID(),
  userId: "user_123",
  sessionId: "session_456",
};

logger.info({ ...requestContext }, "starting agent run");

const result = await agent.run([
  { role: "user", content: "Summarize our refund policy in one paragraph." },
]);

logger.info(
  {
    ...requestContext,
    output: result.messages.at(-1)?.content,
    messageCount: result.messages.length,
  },
  "agent run completed",
);
```
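Rather than spreading `requestContext` into every call by hand, you can bind those fields once with pino's child loggers; `logger.child` is part of pino's standard API, and every subsequent line carries the correlation fields automatically:

```ts
// Bind the correlation fields once; every line from this logger inherits them.
const runLogger = logger.child(requestContext);

runLogger.info("starting agent run"); // includes requestId, userId, sessionId
runLogger.info({ messageCount: 2 }, "agent run completed");
```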
3. Add message-level observability by logging every inbound and outbound message. This is the part most teams skip, then regret later when they need to know whether the model saw the right prompt or produced a bad intermediate step.
```ts
type ChatMessage = {
  role: "user" | "assistant" | "system";
  content: string;
};

// Log each message with its position, role, and a truncated preview.
function logMessages(messages: ChatMessage[], requestId: string) {
  for (const [index, message] of messages.entries()) {
    logger.info(
      {
        requestId,
        index,
        role: message.role,
        contentPreview: message.content.slice(0, 200),
      },
      "chat message",
    );
  }
}

const inputMessages: ChatMessage[] = [
  { role: "system", content: "You are a concise support assistant." },
  { role: "user", content: "Explain our refund policy." },
];

logMessages(inputMessages, requestContext.requestId);
```
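One caveat before shipping this: message content often contains PII. A minimal sketch using pino's built-in `redact` option to censor the `contentPreview` field at the logger level, so raw text never reaches your log store:

```ts
// Redact message content centrally instead of trusting every call site.
const redactingLogger = pino({
  level: "info",
  redact: { paths: ["contentPreview"], censor: "[redacted]" },
});

redactingLogger.info(
  { requestId: "req_1", contentPreview: "customer SSN ..." },
  "chat message",
); // contentPreview is written as "[redacted]"
```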
4. Instrument tool calls explicitly if your agent uses tools. Tool execution is where most production failures happen, so log start time, end time, duration, and error details instead of treating tools as a black box.
```ts
// Wrap any async tool with start/complete/fail logs and a duration.
async function timedTool<T>(name: string, fn: () => Promise<T>): Promise<T> {
  const startedAt = Date.now();
  logger.info({ name }, "tool started");
  try {
    const value = await fn();
    logger.info({ name, durationMs: Date.now() - startedAt }, "tool completed");
    return value;
  } catch (error) {
    logger.error(
      { name, durationMs: Date.now() - startedAt, error },
      "tool failed",
    );
    throw error;
  }
}

const customerLookup = await timedTool("customerLookup", async () => {
  return { id: "cust_1", tier: "gold" };
});
```
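It is worth triggering the failure path once so you know what a failed tool actually looks like in your logs. A hypothetical always-failing tool (`refundLookup` is made up for illustration):

```ts
// Hypothetical failing tool, just to see the "tool failed" log shape.
try {
  await timedTool("refundLookup", async () => {
    throw new Error("refund service timeout");
  });
} catch {
  // Already logged with name, durationMs, and the error by timedTool.
}
```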
5. Add span-style tracing for the full request lifecycle. Even if you are not wiring a full OpenTelemetry exporter yet, keeping the structure now makes it easy to ship traces later without rewriting your agent flow.
```ts
type SpanEvent = {
  traceId: string;
  spanName: string;
};

// Lightweight span: a start/finish log pair sharing one traceId.
async function tracedRun(spanEvent: SpanEvent) {
  logger.info({ ...spanEvent }, "span started");
  const response = await agent.run([
    { role: "user", content: "Write a short refund policy summary." },
  ]);
  logger.info(
    {
      ...spanEvent,
      outputPreview: response.messages.at(-1)?.content?.slice(0, 200),
    },
    "span finished",
  );
  return response;
}

await tracedRun({
  traceId: requestContext.requestId,
  spanName: "agent.run",
});
```
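When you are ready for real traces, the same shape maps directly onto the OpenTelemetry JS API. A sketch using only `@opentelemetry/api`, which stays a no-op until you register an SDK and exporter, so it can ship before Jaeger or Honeycomb is wired up:

```ts
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("support-agent");

// No-op until an SDK and exporter are registered, so this is safe to add now.
const response = await tracer.startActiveSpan("agent.run", async (span) => {
  span.setAttribute("request.id", requestContext.requestId);
  try {
    return await agent.run([
      { role: "user", content: "Write a short refund policy summary." },
    ]);
  } catch (error) {
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw error;
  } finally {
    span.end();
  }
});
```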
6. Put it together in one executable script and run it locally. The important part is that every meaningful event has structured context attached so you can filter by request ID in logs or forward the same fields into traces later.
```ts
import "dotenv/config";
import crypto from "node:crypto";
import pino from "pino";
import { AssistantAgent } from "@autogen/core";
import { OpenAIChatCompletionClient } from "@autogen/openai";

const logger = pino({ level: "info" });

const modelClient = new OpenAIChatCompletionClient({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
});

// Pass the model client through the constructor, as in step 1.
const agent = new AssistantAgent({
  name: "support-agent",
  modelClient,
});

const requestId = crypto.randomUUID();
logger.info({ requestId }, "starting agent run");

const result = await agent.run([
  { role: "user", content: "Summarize our refund policy in one paragraph." },
]);

logger.info(
  {
    requestId,
    outputPreview: String(result.messages.at(-1)?.content).slice(0, 200),
    messageCount: result.messages.length,
  },
  "agent run completed",
);
```
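Assuming the file is saved as `observability.ts` and your project uses `ts-node` from the prerequisites, one way to run it is `OPENAI_API_KEY=sk-... npx ts-node --esm observability.ts`. The top-level `await` requires ESM, so you need either the `--esm` flag or `"type": "module"` in `package.json`.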
Testing It
Run the script with `OPENAI_API_KEY` set and confirm you see structured JSON logs for start, messages, tool execution, and completion. If you only see the final answer, your instrumentation is too shallow.
Check that every log line contains the same `requestId`. That is what lets you reconstruct one conversation across retries, tool calls, and downstream service calls.
If you have multiple concurrent requests, fire two runs at once and make sure their logs stay separated by correlation ID. If they do not, fix that before adding more agents or tools.
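A quick way to run that check, reusing `logger`, `agent`, and the `crypto` import from the steps above:

```ts
// Fire two runs concurrently; each gets its own correlation ID.
const runs = ["user_123", "user_456"].map((userId) => {
  const ctx = { requestId: crypto.randomUUID(), userId };
  const runLogger = logger.child(ctx);
  runLogger.info("starting agent run");
  return agent
    .run([{ role: "user", content: "Explain our refund policy." }])
    .then((res) => {
      runLogger.info({ messageCount: res.messages.length }, "agent run completed");
    });
});

await Promise.all(runs);
// Filter the output by each requestId; the two streams must stay cleanly separated.
```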
Next Steps
- Add OpenTelemetry spans and export them to Jaeger or Honeycomb
- Wrap AutoGen tool functions with retry metrics and failure counters (a sketch follows below)
- Store conversation transcripts in object storage for replayable debugging
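For the second item, a minimal sketch layering retries and an in-memory failure counter on top of `timedTool`; the `Map` counter and backoff numbers are illustrative stand-ins for a real metrics client:

```ts
// Illustrative in-memory counters; swap for prom-client or similar in production.
const failureCounts = new Map<string, number>();

async function retriedTool<T>(
  name: string,
  fn: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await timedTool(name, fn);
    } catch (error) {
      failureCounts.set(name, (failureCounts.get(name) ?? 0) + 1);
      if (attempt >= maxAttempts) throw error;
      logger.warn({ name, attempt }, "retrying tool");
      await new Promise((resolve) => setTimeout(resolve, 250 * attempt));
    }
  }
}
```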
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.