AutoGen Tutorial (TypeScript): Mocking LLM Calls in Tests for Advanced Developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to make AutoGen-based TypeScript tests deterministic by mocking LLM calls at the boundary, instead of hitting a real model on every run. You need this when you want fast CI, stable assertions, and tests that validate orchestration logic rather than model behavior.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • npm or pnpm
  • autogen installed in your project
  • A test runner such as Vitest or Jest
  • An OpenAI API key only if you want to run the unmocked version locally
  • A basic AutoGen setup with at least one agent using an OpenAI-compatible model client

Step-by-Step

  1. Start by extracting LLM access behind a small function. That gives you one place to swap real model calls for a mock during tests.
// llm.ts
import { AssistantAgent } from "autogen";

export async function askAgent(
  agent: AssistantAgent,
  message: string,
): Promise<string> {
  const result = await agent.run({
    task: message,
  });

  return result.messages
    .filter((m) => m.source === "assistant")
    .map((m) => m.content)
    .join("\n");
}
  2. Create the agent in a factory so your production code and test code can share the same shape. The important part is that the model client is injectable, not hard-coded.
// agent-factory.ts
import { AssistantAgent, OpenAIChatCompletionClient } from "autogen";

export function createAssistant(modelClient?: OpenAIChatCompletionClient) {
  return new AssistantAgent({
    name: "support_agent",
    modelClient:
      modelClient ??
      new OpenAIChatCompletionClient({
        model: "gpt-4o-mini",
        apiKey: process.env.OPENAI_API_KEY!,
      }),
    systemMessage: "You are a support assistant.",
  });
}
  3. Build a mock model client for tests. In AutoGen, the cleanest pattern is to implement the same interface shape your agent expects and return fixed messages for known prompts.
// mock-model-client.ts
export class MockModelClient {
  async create(messages: Array<{ role: string; content: string }>) {
    const last = messages[messages.length - 1]?.content ?? "";

    if (last.includes("refund")) {
      return {
        content: "I can help with refunds. Please provide your order ID.",
      };
    }

    return {
      content: "I need more details to continue.",
    };
  }
}
  4. Wire the mock into your agent factory in the test file. This keeps the test focused on behavior: given a prompt, does your orchestration produce the expected response?
// llm.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("AutoGen agent with mocked LLM", () => {
  it("returns a refund-specific response", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "I want a refund for my order");

    expect(answer).toContain("refunds");
    expect(answer).toContain("order ID");
  });

  it("returns the fallback response for unknown prompts", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "Hello");

    expect(answer).toContain("more details");
  });
});
  5. If you need stricter tests, assert on exact output and not just substrings. That catches accidental prompt changes and makes regressions obvious when someone edits system instructions.
// strict.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("strict output checks", () => {
  it("matches the exact mocked response", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "refund please");

    expect(answer.trim()).toBe(
      "I can help with refunds. Please provide your order ID.",
    );
  });
});

Testing It

Run your test suite with Vitest or your runner of choice and confirm that no network calls are made. If a test starts hanging or failing because of API access, your mock is not being injected correctly and you’re still hitting a real client somewhere.

The useful check here is repeatability:

  • same input
  • same output
  • no dependency on rate limits or model drift

If you want to verify isolation further, temporarily unset OPENAI_API_KEY and rerun the suite. The mocked tests should still pass because they never touch the live provider.
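For a stricter guarantee, you can make any accidental network call fail loudly. The sketch below is one way to do it, assuming you register a test-setup.ts file through Vitest's `setupFiles` option and that a real model client would ultimately go through the global fetch shipped with Node 18+; both the file name and that assumption are illustrative choices, not AutoGen requirements.
// test-setup.ts
// Hypothetical guard file, registered via `setupFiles` in vitest.config.ts.
// Assumes a live model client would eventually call the global fetch in
// Node 18+; if anything does, the suite fails immediately with a clear error.
globalThis.fetch = (async (input: unknown) => {
  throw new Error(`Unexpected network call during tests: ${String(input)}`);
}) as typeof fetch;

With that guard in place, a test that bypasses the mock fails right away with a readable message instead of hanging on a network timeout.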

Next Steps

  • Add snapshot tests for full multi-turn AutoGen conversations (see the sketch after this list).
  • Mock tool calls separately from LLM calls so you can isolate planner vs executor failures.
  • Build contract tests that validate prompt format before you ship changes to production agents.
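As a starting point for the snapshot idea above, the sketch below freezes the mocked reply for a known prompt. It reuses createAssistant, askAgent, and MockModelClient from this tutorial together with Vitest's built-in toMatchSnapshot; extending it to genuine multi-turn conversations depends on how your agents are wired, so treat it as a template rather than a drop-in test.
// conversation.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("snapshot of a mocked conversation", () => {
  it("keeps the refund flow stable across refactors", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "I want a refund for my order");

    // Vitest writes the first run's output to a snapshot file and fails
    // any later run whose output differs.
    expect(answer).toMatchSnapshot();
  });
});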

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

