CrewAI Tutorial (TypeScript): Mocking LLM Calls in Tests for Intermediate Developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to write deterministic tests for a CrewAI TypeScript agent by mocking LLM calls at the boundary where your code talks to the model. You need this when you want fast CI, stable snapshots, and unit tests that fail because your code changed — not because the model answered differently.

What You'll Need

  • Node.js 18+ and a TypeScript project
  • crewai installed in your app
  • A test runner such as vitest (a minimal config sketch follows this list)
  • typescript and ts-node or a normal TS build pipeline
  • An API key for the LLM provider you use in production, if you want to run the non-mocked path
  • Basic familiarity with CrewAI concepts like Agent, Task, and Crew
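If you are setting the test runner up from scratch, a minimal Vitest config is enough. The file name and the test glob below are assumptions; point them at wherever your tests actually live.

// vitest.config.ts: minimal sketch, adjust the include glob to your layout
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Pick up the unit tests written later in this tutorial.
    include: ["tests/**/*.test.ts"],
  },
});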

Step-by-Step

  1. Start with a tiny CrewAI setup that uses an injected model client instead of hardcoding network calls. The key idea is to wrap the LLM dependency behind an interface so your tests can swap in a fake implementation.
// src/llm-client.ts

export interface LlmClient {
  complete(prompt: string): Promise<string>;
}

export class OpenAiLikeClient implements LlmClient {
  constructor(private readonly apiKey: string) {}

  async complete(prompt: string): Promise<string> {
    const response = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${this.apiKey}`,
      },
      body: JSON.stringify({
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: prompt }],
      }),
    });

    if (!response.ok) {
      throw new Error(`LLM request failed with status ${response.status}`);
    }

    const json = await response.json();
    return json.choices[0].message.content as string;
  }
}
  2. Build your business function around that client. This keeps the CrewAI objects (Agent, Task, Crew) in place while making the actual model call replaceable in tests; a short production-wiring sketch follows the code.
// src/summarize-claim.ts
import { Agent, Crew, Task } from "crewai";
import { LlmClient } from "./llm-client";

export async function summarizeClaim(notes: string, llm: LlmClient) {
  const agent = new Agent({
    role: "Claims Analyst",
    goal: "Summarize claim notes clearly",
    backstory: "You work in insurance operations.",
  });

  const task = new Task({
    description: `Summarize these claim notes:\n${notes}`,
    expectedOutput: "A concise summary with action items.",
    agent,
  });

  const crew = new Crew({
    agents: [agent],
    tasks: [task],
  });

  const prompt = `${task.description}\n\nReturn only the summary.`;
  const summary = await llm.complete(prompt);

  return {
    crew,
    summary,
  };
}
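Outside of tests, the same function receives the real client. Here is a minimal wiring sketch, assuming the key lives in an OPENAI_API_KEY environment variable and that you have an entry point file; adjust both to however you manage configuration.

// src/main.ts: hypothetical entry point wiring the real client into summarizeClaim
import { OpenAiLikeClient } from "./llm-client";
import { summarizeClaim } from "./summarize-claim";

async function main() {
  const llm = new OpenAiLikeClient(process.env.OPENAI_API_KEY ?? "");

  const { summary } = await summarizeClaim(
    "Customer called about water leak in kitchen after storm.",
    llm
  );

  console.log(summary);
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});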
  3. Create a mock client for tests. This is where you control outputs, assert prompts, and avoid real API traffic.
// mock-llm-client.ts, placed alongside your test files
import { LlmClient } from "../src/llm-client";

export class MockLlmClient implements LlmClient {
  public prompts: string[] = [];

  constructor(private readonly responses: string[]) {}

  async complete(prompt: string): Promise<string> {
    this.prompts.push(prompt);
    const next = this.responses.shift();
    // Check for undefined explicitly so an intentionally queued empty string still counts as a response.
    if (next === undefined) throw new Error("No mock response left");
    return next;
  }
}
  4. Write a test that proves your orchestration code calls the model with the right prompt and handles the response correctly. Use exact assertions on the prompt so regressions show up immediately.
import { describe, expect, it } from "vitest";
import { summarizeClaim } from "../src/summarize-claim";
import { MockLlmClient } from "./mock-llm-client";

describe("summarizeClaim", () => {
  it("uses the injected LLM client instead of calling the network", async () => {
    const llm = new MockLlmClient(["Customer reported water damage. Action: request photos."]);

    const result = await summarizeClaim(
      "Customer called about water leak in kitchen after storm.",
      llm
    );

    expect(result.summary).toContain("water damage");
    expect(llm.prompts).toHaveLength(1);
    expect(llm.prompts[0]).toContain("Customer called about water leak in kitchen after storm.");
  });
});
  5. If you want to test multiple paths, queue responses in order. This is useful when your code makes more than one model call or when you want to simulate retries and malformed output (see the failure-simulating sketch after these steps).
import { describe, expect, it } from "vitest";
import { MockLlmClient } from "./mock-llm-client";

describe("MockLlmClient", () => {
  it("returns responses in sequence", async () => {
    const llm = new MockLlmClient(["first", "second"]);

    const a = await llm.complete("prompt-a");
    const b = await llm.complete("prompt-b");

    expect(a).toBe("first");
    expect(b).toBe("second");
    expect(llm.prompts).toEqual(["prompt-a", "prompt-b"]);
  });
});
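For the retry and malformed-output cases, a small failing variant of the mock can live next to MockLlmClient. This is a sketch rather than part of the core flow, and the class name is made up here.

// flaky-llm-client.ts: hypothetical helper for retry tests, kept with the test files
import { LlmClient } from "../src/llm-client";

// Simulates a transient provider failure: the first call throws, later calls succeed.
export class FlakyLlmClient implements LlmClient {
  private calls = 0;

  constructor(private readonly eventualResponse: string) {}

  async complete(_prompt: string): Promise<string> {
    this.calls += 1;
    if (this.calls === 1) {
      throw new Error("Simulated provider outage");
    }
    return this.eventualResponse;
  }
}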

Testing It

Run your test suite with vitest run and confirm that no test tries to hit the external API. If you see network requests during unit tests, you still have a direct dependency somewhere that needs to be injected or wrapped.
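One way to enforce this, sketched here as an option rather than a required step, is a Vitest setup file that replaces the global fetch so any accidental network call fails loudly. The file name is an assumption; register it via the setupFiles option in your Vitest config.

// tests/setup.ts: hypothetical setup file, added to setupFiles in vitest.config.ts
import { vi } from "vitest";

// Any code path that still reaches the network now fails the test immediately.
vi.stubGlobal("fetch", () => {
  throw new Error("Unit tests must not make network calls; inject a MockLlmClient instead");
});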

A good sanity check is to temporarily disconnect from the internet or unset your API key and rerun the suite. The mocked tests should still pass, which tells you your unit layer is isolated correctly.

If you also keep one integration test with a real key, mark it separately so CI can skip it by default. That gives you coverage for both prompt logic and provider wiring without making every build depend on live inference.
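With Vitest you can gate that test on the presence of the key, for example with describe.skipIf, so default CI runs stay offline. A sketch, assuming the key is exposed as OPENAI_API_KEY:

import { describe, expect, it } from "vitest";
import { OpenAiLikeClient } from "../src/llm-client";
import { summarizeClaim } from "../src/summarize-claim";

const apiKey = process.env.OPENAI_API_KEY;

// Runs only when a real key is present, e.g. in a nightly job.
describe.skipIf(!apiKey)("summarizeClaim (integration)", () => {
  it("produces a non-empty summary from the live provider", async () => {
    const llm = new OpenAiLikeClient(apiKey!);

    const { summary } = await summarizeClaim(
      "Customer reported a cracked windshield after hail.",
      llm
    );

    expect(summary.length).toBeGreaterThan(0);
  });
});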

Next Steps

  • Add snapshot testing for prompts so changes to task wording are reviewed intentionally (a sketch follows this list)
  • Split unit tests for orchestration from integration tests for provider adapters
  • Add retry logic tests by returning invalid JSON or empty strings from MockLlmClient
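For the snapshot idea, the test can be as small as the sketch below. It reuses the MockLlmClient from earlier, and the stub response text is arbitrary.

import { describe, expect, it } from "vitest";
import { summarizeClaim } from "../src/summarize-claim";
import { MockLlmClient } from "./mock-llm-client";

describe("prompt wording", () => {
  it("is reviewed via snapshot when task text changes", async () => {
    const llm = new MockLlmClient(["stub summary"]);

    await summarizeClaim("Customer called about water leak in kitchen after storm.", llm);

    // Any edit to the task description or prompt template shows up as a snapshot diff.
    expect(llm.prompts[0]).toMatchSnapshot();
  });
});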


By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
