AutoGen Tutorial (TypeScript): mocking LLM calls in tests for advanced developers
This tutorial shows how to make AutoGen-based TypeScript tests deterministic by mocking LLM calls at the boundary, instead of hitting a real model on every run. You need this when you want fast CI, stable assertions, and tests that validate orchestration logic rather than model behavior.
What You'll Need
- Node.js 18+
- TypeScript 5+
- `npm` or `pnpm`
- `autogen` installed in your project
- A test runner like `vitest` or `jest`
- An OpenAI API key, only if you want to run the unmocked version locally
- A basic AutoGen setup with at least one agent using an OpenAI-compatible model client
Step-by-Step
- Start by extracting LLM access behind a small function. That gives you one place to swap real model calls for a mock during tests.
```ts
// llm.ts
import { AssistantAgent } from "autogen";

export async function askAgent(
  agent: AssistantAgent,
  message: string,
): Promise<string> {
  const result = await agent.run({
    task: message,
  });
  // Keep only the agent's own messages. The factory below names the agent
  // "support_agent", so filter on the agent's name rather than a
  // hard-coded "assistant" source.
  return result.messages
    .filter((m) => m.source === agent.name)
    .map((m) => m.content)
    .join("\n");
}
```
- Create the agent in a factory so your production code and test code can share the same shape. The important part is that the model client is injectable, not hard-coded.
```ts
// agent-factory.ts
import { AssistantAgent, OpenAIChatCompletionClient } from "autogen";

export function createAssistant(modelClient?: OpenAIChatCompletionClient) {
  return new AssistantAgent({
    name: "support_agent",
    modelClient:
      modelClient ??
      new OpenAIChatCompletionClient({
        model: "gpt-4o-mini",
        apiKey: process.env.OPENAI_API_KEY!,
      }),
    systemMessage: "You are a support assistant.",
  });
}
```
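In production code, call the factory with no argument so it falls back to the real client. A minimal sketch, assuming the same package shape as above and an ESM entry point that allows top-level await:

```ts
// main.ts (hypothetical entry point)
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";

// No client passed in, so the factory builds the real OpenAI-backed client.
const agent = createAssistant();
const reply = await askAgent(agent, "I want a refund for my order");
console.log(reply);
```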
- Build a mock model client for tests. In AutoGen, the cleanest pattern is to implement the same interface shape your agent expects and return fixed messages for known prompts.
```ts
// mock-model-client.ts
export class MockModelClient {
  async create(messages: Array<{ role: string; content: string }>) {
    const last = messages[messages.length - 1]?.content ?? "";
    if (last.includes("refund")) {
      return {
        content: "I can help with refunds. Please provide your order ID.",
      };
    }
    return {
      content: "I need more details to continue.",
    };
  }
}
```
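If you also want to assert on what your orchestration sends to the model, a recording variant is useful. This is a sketch following the same hypothetical create() shape as MockModelClient above, not an AutoGen API:

```ts
// recording-mock-client.ts (hypothetical helper)
export class RecordingMockClient {
  // Every batch of messages the agent sent, in call order.
  calls: Array<Array<{ role: string; content: string }>> = [];

  async create(messages: Array<{ role: string; content: string }>) {
    this.calls.push(messages);
    return { content: "stubbed response" };
  }
}
```

Inject it the same way as MockModelClient and assert on `client.calls` to verify the system message or task made it through unchanged.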
- Wire the mock into your agent factory in the test file. This keeps the test focused on behavior: given a prompt, does your orchestration produce the expected response?
```ts
// llm.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("AutoGen agent with mocked LLM", () => {
  it("returns a refund-specific response", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "I want a refund for my order");
    expect(answer).toContain("refunds");
    expect(answer).toContain("order ID");
  });

  it("returns the fallback response for unknown prompts", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "Hello");
    expect(answer).toContain("more details");
  });
});
```
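The `as never` cast silences the compiler because `never` is assignable to every parameter type, but it also throws away type checking on the mock. If you would rather have the mock verified against the shape the agent actually uses, one option is to type the factory against a small structural interface instead of the concrete OpenAI client. The interface below is an assumption about your own factory, not part of AutoGen:

```ts
// chat-client.ts (hypothetical structural type)
export interface ChatClient {
  create(
    messages: Array<{ role: string; content: string }>,
  ): Promise<{ content: string }>;
}
```

Have `createAssistant(modelClient?: ChatClient)` accept it, and both the real client and `MockModelClient` satisfy it without casts.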
- If you need stricter tests, assert on exact output and not just substrings. That catches accidental prompt changes and makes regressions obvious when someone edits system instructions.
```ts
// strict.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("strict output checks", () => {
  it("matches the exact mocked response", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "refund please");
    expect(answer.trim()).toBe(
      "I can help with refunds. Please provide your order ID.",
    );
  });
});
```
Testing It
Run your test suite with vitest or your runner of choice and confirm no network calls are made. If a test starts hanging or failing because of API access, your mock is not being injected correctly and you’re still hitting a real client somewhere.
The useful check here is repeatability:
- same input
- same output
- no dependency on rate limits or model drift
If you want to verify isolation further, temporarily unset `OPENAI_API_KEY` and rerun the suite. The mocked tests should still pass because they never touch the live provider.
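To enforce isolation mechanically, a vitest setup file can replace global `fetch` so any accidental network call fails loudly. This assumes your model client goes through global `fetch`; a client built on another transport would need a different guard:

```ts
// test-setup.ts (hypothetical guard, registered via setupFiles in your
// vitest config)
import { beforeAll } from "vitest";

beforeAll(() => {
  globalThis.fetch = (async () => {
    throw new Error("Unexpected network call during a mocked test run");
  }) as typeof fetch;
});
```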
Next Steps
- Add snapshot tests for full multi-turn AutoGen conversations (see the sketch after this list).
- Mock tool calls separately from LLM calls so you can isolate planner vs executor failures.
- Build contract tests that validate prompt format before you ship changes to production agents.
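A snapshot test for a short scripted conversation might look like this. It reuses the mock above and vitest's built-in `toMatchSnapshot`; the two-turn script is just an illustration:

```ts
// conversation.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("multi-turn conversation snapshot", () => {
  it("matches the recorded transcript", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const transcript: string[] = [];
    for (const turn of ["I want a refund", "Hello"]) {
      transcript.push(await askAgent(agent, turn));
    }
    expect(transcript).toMatchSnapshot();
  });
});
```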
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.