AutoGen Tutorial (TypeScript): Mocking LLM Calls in Tests for Advanced Developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to make AutoGen-based TypeScript tests deterministic by mocking LLM calls at the boundary, instead of hitting a real model on every run. You need this when you want fast CI, stable assertions, and tests that validate orchestration logic rather than model behavior.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • npm or pnpm
  • autogen installed in your project
  • A test runner such as Vitest or Jest
  • An OpenAI API key only if you want to run the unmocked version locally
  • A basic AutoGen setup with at least one agent using an OpenAI-compatible model client

Step-by-Step

  1. Start by extracting LLM access behind a small function. That gives you one place to swap real model calls for a mock during tests.
// llm.ts
import { AssistantAgent } from "autogen";

export async function askAgent(
  agent: AssistantAgent,
  message: string,
): Promise<string> {
  const result = await agent.run({
    task: message,
  });

  return result.messages
    .filter((m) => m.source === "assistant")
    .map((m) => m.content)
    .join("\n");
}
  2. Create the agent in a factory so your production code and test code can share the same shape. The important part is that the model client is injectable, not hard-coded.
// agent-factory.ts
import { AssistantAgent, OpenAIChatCompletionClient } from "autogen";

export function createAssistant(modelClient?: OpenAIChatCompletionClient) {
  return new AssistantAgent({
    name: "support_agent",
    modelClient:
      modelClient ??
      new OpenAIChatCompletionClient({
        model: "gpt-4o-mini",
        apiKey: process.env.OPENAI_API_KEY!,
      }),
    systemMessage: "You are a support assistant.",
  });
}
  3. Build a mock model client for tests. In AutoGen, the cleanest pattern is to implement the same interface shape your agent expects and return fixed messages for known prompts.
// mock-model-client.ts
export class MockModelClient {
  async create(messages: Array<{ role: string; content: string }>) {
    const last = messages[messages.length - 1]?.content ?? "";

    if (last.includes("refund")) {
      return {
        content: "I can help with refunds. Please provide your order ID.",
      };
    }

    return {
      content: "I need more details to continue.",
    };
  }
}
  4. Wire the mock into your agent factory in the test file. This keeps the test focused on behavior: given a prompt, does your orchestration produce the expected response?
// llm.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("AutoGen agent with mocked LLM", () => {
  it("returns a refund-specific response", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "I want a refund for my order");

    expect(answer).toContain("refunds");
    expect(answer).toContain("order ID");
  });

  it("returns the fallback response for unknown prompts", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "Hello");

    expect(answer).toContain("more details");
  });
});
  5. If you need stricter tests, assert on exact output and not just substrings. That catches accidental prompt changes and makes regressions obvious when someone edits system instructions.
// strict.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("strict output checks", () => {
  it("matches the exact mocked response", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "refund please");

    expect(answer.trim()).toBe(
      "I can help with refunds. Please provide your order ID.",
    );
  });
});

Testing It

Run your test suite with Vitest or your runner of choice and confirm that no network calls are made. If a test starts hanging or failing because of API access, your mock is not being injected correctly and you’re still hitting a real client somewhere.

The useful check here is repeatability:

  • same input
  • same output
  • no dependency on rate limits or model drift

If you want to verify isolation further, temporarily unset OPENAI_API_KEY and rerun the suite. The mocked tests should still pass because they never touch the live provider.
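For a stricter guarantee, you can make any accidental network call fail loudly. The sketch below is one way to do it, assuming you register a test-setup.ts file through Vitest's `setupFiles` option and that a real model client would ultimately go through the global fetch shipped with Node 18+; both the file name and that assumption are illustrative choices, not AutoGen requirements.
// test-setup.ts
// Hypothetical guard file, registered via `setupFiles` in vitest.config.ts.
// Assumes a live model client would eventually call the global fetch in
// Node 18+; if anything does, the suite fails immediately with a clear error.
globalThis.fetch = (async (input: unknown) => {
  throw new Error(`Unexpected network call during tests: ${String(input)}`);
}) as typeof fetch;

With that guard in place, a test that bypasses the mock fails right away with a readable message instead of hanging on a network timeout.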

Next Steps

  • Add snapshot tests for full multi-turn AutoGen conversations (see the sketch after this list).
  • Mock tool calls separately from LLM calls so you can isolate planner vs executor failures.
  • Build contract tests that validate prompt format before you ship changes to production agents.
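As a starting point for the snapshot idea above, the sketch below freezes the mocked reply for a known prompt. It reuses createAssistant, askAgent, and MockModelClient from this tutorial together with Vitest's built-in toMatchSnapshot; extending it to genuine multi-turn conversations depends on how your agents are wired, so treat it as a template rather than a drop-in test.
// conversation.test.ts
import { describe, expect, it } from "vitest";
import { createAssistant } from "./agent-factory";
import { askAgent } from "./llm";
import { MockModelClient } from "./mock-model-client";

describe("snapshot of a mocked conversation", () => {
  it("keeps the refund flow stable across refactors", async () => {
    const agent = createAssistant(new MockModelClient() as never);
    const answer = await askAgent(agent, "I want a refund for my order");

    // Vitest writes the first run's output to a snapshot file and fails
    // any later run whose output differs.
    expect(answer).toMatchSnapshot();
  });
});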

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

