# CrewAI Tutorial (TypeScript): Mocking LLM Calls in Tests for Advanced Developers
This tutorial shows how to test a CrewAI TypeScript workflow without calling a real model. You’ll mock the LLM boundary so your tests are fast, deterministic, and safe to run in CI.
## What You'll Need

- Node.js 18+
- TypeScript 5+
- `crewai` installed in your project
- A test runner: `vitest` or `jest`
- `dotenv` if you load env vars locally
- No OpenAI key is required for the mocked tests, but keep one available for integration tests
- A CrewAI project with at least one agent and one task
## Step-by-Step
- Start with a small crew that depends on an injected LLM. The important part is to avoid hardcoding model access inside your business logic: tests need a seam they can replace.
```ts
import { Agent, Task, Crew, LLM } from "crewai";

// Accept the LLM as a parameter so tests can inject a fake implementation.
export function buildCrew(llm: LLM) {
  const analyst = new Agent({
    role: "Claims analyst",
    goal: "Summarize claim risk clearly",
    backstory: "You review insurance claims and produce concise assessments.",
    llm,
  });

  const task = new Task({
    description: "Summarize this claim: water damage in kitchen after pipe burst.",
    expectedOutput: "A short risk summary with recommendation.",
    agent: analyst,
  });

  return new Crew({
    agents: [analyst],
    tasks: [task],
  });
}
```
- Create a fake LLM for tests by mocking the method CrewAI calls when it needs text generation. In practice, this keeps your test focused on orchestration and prompt wiring instead of network behavior.
```ts
import { LLM } from "crewai";

// A fake LLM that overrides the text-generation call, so tests never
// touch the network and always get the same canned payload back.
export class MockLLM extends LLM {
  constructor() {
    super({ model: "mock/model" });
  }

  async call(prompt: string): Promise<string> {
    if (prompt.includes("water damage")) {
      return JSON.stringify({
        summary: "Moderate severity water damage claim.",
        recommendation: "Request photos and plumbing report.",
      });
    }
    return JSON.stringify({
      summary: "Default mock response.",
      recommendation: "Review manually.",
    });
  }
}
```
- Write a unit test that injects the mock and asserts on the crew output. Keep the assertion stable by checking for known substrings or parsed JSON rather than exact token-for-token output.
```ts
import { describe, it, expect } from "vitest";
import { buildCrew } from "./buildCrew";
import { MockLLM } from "./MockLLM";

describe("crew with mocked llm", () => {
  it("returns deterministic output", async () => {
    const crew = buildCrew(new MockLLM());
    const result = await crew.kickoff();

    // Assert on stable substrings rather than exact model output.
    expect(String(result)).toContain("water damage");
    expect(String(result)).toContain("Request photos");
  });
});
```
- If your code uses environment-driven model selection, isolate that behind a factory. Your production path can use a real provider while tests swap in the mock without changing the rest of the system.
```ts
import { LLM } from "crewai";
import { MockLLM } from "./MockLLM";

// Central factory: tests get the mock, everything else gets a real provider.
export function createLLM(): LLM {
  if (process.env.NODE_ENV === "test") {
    return new MockLLM();
  }
  return new LLM({
    model: process.env.CREWAI_MODEL ?? "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY,
  });
}
```
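Call sites then stay provider-agnostic. A minimal sketch of a hypothetical ESM entry point (file names are illustrative, not part of CrewAI):

```ts
import { buildCrew } from "./buildCrew";
import { createLLM } from "./createLLM";

// The rest of the system never needs to know which LLM it received.
const crew = buildCrew(createLLM());
const result = await crew.kickoff();
console.log(String(result));
```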
- If you need stronger guarantees, assert that the right prompt reached the mock. This catches regressions where someone changes task wording or stops passing critical context into the agent.
```ts
import { LLM } from "crewai";

// Records every prompt it receives so tests can assert on prompt content.
export class RecordingMockLLM extends LLM {
  public prompts: string[] = [];

  constructor() {
    super({ model: "mock/model" });
  }

  async call(prompt: string): Promise<string> {
    this.prompts.push(prompt);
    return '{"summary":"ok","recommendation":"continue"}';
  }
}
```
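A test can then assert on what actually reached the mock. A minimal sketch, assuming the crew forwards the task description into the prompt it sends the LLM (import paths are placeholders):

```ts
import { describe, it, expect } from "vitest";
import { buildCrew } from "./buildCrew";
import { RecordingMockLLM } from "./RecordingMockLLM";

describe("prompt wiring", () => {
  it("passes the claim context to the LLM", async () => {
    const llm = new RecordingMockLLM();
    await buildCrew(llm).kickoff();

    // At least one prompt should carry the critical claim details.
    expect(llm.prompts.length).toBeGreaterThan(0);
    expect(llm.prompts.some((p) => p.includes("water damage"))).toBe(true);
  });
});
```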
## Testing It
Run your test suite with NODE_ENV=test (for example, `NODE_ENV=test npx vitest run`) so your factory returns the mock implementation. The tests should complete without any network calls, and they should pass consistently on every run.
If you want to verify the seam is working, temporarily make the mock return an obviously wrong payload and confirm the test fails for the right reason. That tells you your assertions are actually checking behavior instead of just waiting for a truthy result.
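One way to run that check, using a hypothetical `BrokenMockLLM` whose payload no reasonable assertion should accept:

```ts
import { LLM } from "crewai";

// Deliberately wrong payload: if the suite still passes with this mock
// injected, the assertions are not really checking behavior.
export class BrokenMockLLM extends LLM {
  constructor() {
    super({ model: "mock/model" });
  }

  async call(_prompt: string): Promise<string> {
    return JSON.stringify({ summary: "WRONG", recommendation: "WRONG" });
  }
}
```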
For deeper validation, add one separate integration test file that uses a real API key and hits an actual provider. Keep that out of normal CI if cost or rate limits matter.
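A sketch of that guard using Vitest's `describe.skipIf`, so the live test is skipped rather than failed when no key is configured:

```ts
import { describe, it, expect } from "vitest";
import { LLM } from "crewai";
import { buildCrew } from "./buildCrew";

const hasKey = Boolean(process.env.OPENAI_API_KEY);

// Skipped automatically when no key is present, so normal CI stays offline.
describe.skipIf(!hasKey)("crew against a real provider", () => {
  it("produces a non-empty summary", async () => {
    const llm = new LLM({
      model: process.env.CREWAI_MODEL ?? "gpt-4o-mini",
      apiKey: process.env.OPENAI_API_KEY,
    });
    const result = await buildCrew(llm).kickoff();
    expect(String(result).length).toBeGreaterThan(0);
  }, 60_000); // generous timeout for a live network call
});
```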
## Next Steps
- Add snapshot-style assertions around structured outputs from tasks that return JSON (see the sketch after this list).
- Wrap crew creation in a service layer so agents, tools, and models are all injectable.
- Add contract tests for tool-calling flows so you can mock both LLM responses and tool outputs separately.
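For the first item, a minimal snapshot sketch against the mocked crew from the unit test above, assuming the task output is the JSON string the mock returned:

```ts
import { it, expect } from "vitest";
import { buildCrew } from "./buildCrew";
import { MockLLM } from "./MockLLM";

it("matches the structured output snapshot", async () => {
  const result = await buildCrew(new MockLLM()).kickoff();

  // Snapshot the parsed structure, not the raw string, so incidental
  // formatting changes do not break the test.
  expect(JSON.parse(String(result))).toMatchSnapshot();
});
```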
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.