# CrewAI Tutorial (TypeScript): mocking LLM calls in tests for beginners
This tutorial shows how to test a CrewAI TypeScript workflow without making real LLM calls. You’ll replace the model with a deterministic mock so your tests are fast, cheap, and stable when you’re validating prompts, task wiring, and output parsing.
## What You'll Need

- Node.js 18+ installed
- A TypeScript project with `typescript`, `ts-node`, and a test runner like `vitest`
- CrewAI for TypeScript installed in your project
- An API key only if you want to compare mocked tests against real runs later
- Basic familiarity with:
  - `Agent`
  - `Task`
  - `Crew`
  - `async`/`await` in TypeScript
## Step-by-Step
1. Set up a small CrewAI workflow that we can test. Keep it minimal: one agent, one task, one crew. The point is to isolate the LLM boundary so the test only checks your orchestration logic.
```ts
// src/crew.ts
import { Agent, Crew, Task } from "crewai";

export function buildSupportCrew() {
  const supportAgent = new Agent({
    name: "Support Analyst",
    role: "Customer support triage",
    goal: "Classify support tickets into billing, login, or technical issues",
    backstory: "You are precise and concise.",
  });

  const classifyTicket = new Task({
    description:
      "Classify this ticket: 'I was charged twice for my subscription.' Return only the category.",
    expectedOutput: "One word category",
    agent: supportAgent,
  });

  return new Crew({
    agents: [supportAgent],
    tasks: [classifyTicket],
  });
}
```
2. Add a mock LLM implementation. In tests, you want a predictable response every time. The cleanest approach is to inject a fake model object that matches the interface your CrewAI setup expects.
```ts
// tests/mockLlm.ts
// A deterministic stand-in for the real model: routes on keywords so
// every test run gets the same answer.
export class MockLlm {
  async call(prompt: string): Promise<string> {
    if (prompt.includes("charged twice")) {
      return "billing";
    }
    return "technical";
  }
}
```
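If you want the compiler to enforce this seam, you can sketch the expected shape as a small interface. The `LlmLike` name and its single `call(prompt)` method below are assumptions for illustration; match whatever your CrewAI version actually invokes on the model object.

```ts
// tests/llmLike.ts
import { MockLlm } from "./mockLlm";

// Hypothetical interface for the LLM seam. The name and the single
// call(prompt) method are assumptions -- match whatever your CrewAI
// version actually invokes on the model object.
export interface LlmLike {
  call(prompt: string): Promise<string>;
}

// Exported compile-time check: the build fails if MockLlm drifts from this shape.
export const typedMock: LlmLike = new MockLlm();
```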
3. Wire the mock into your crew factory for test mode. This keeps production code using the real provider while tests swap in the fake. If your version of CrewAI exposes an LLM config on the agent or crew, use that seam instead of patching internals.
```ts
// src/crew.testable.ts
import { Agent, Crew, Task } from "crewai";
import { MockLlm } from "../tests/mockLlm";

export function buildSupportCrewForTest() {
  const llm = new MockLlm();

  const supportAgent = new Agent({
    name: "Support Analyst",
    role: "Customer support triage",
    goal: "Classify support tickets into billing, login, or technical issues",
    backstory: "You are precise and concise.",
    llm,
  });

  const classifyTicket = new Task({
    description:
      "Classify this ticket: 'I was charged twice for my subscription.' Return only the category.",
    expectedOutput: "One word category",
    agent: supportAgent,
  });

  return new Crew({
    agents: [supportAgent],
    tasks: [classifyTicket],
  });
}
```
4. Write a test that asserts the crew returns the mocked output. This verifies your task setup without requiring network access or token usage. Use a real assertion on the final result so you catch prompt drift and wiring mistakes.
```ts
// tests/crew.test.ts
import { describe, expect, it } from "vitest";
import { buildSupportCrewForTest } from "../src/crew.testable";

describe("support crew", () => {
  it("returns billing for a charge dispute", async () => {
    const crew = buildSupportCrewForTest();
    const result = await crew.run();
    expect(String(result).toLowerCase()).toContain("billing");
  });
});
```
5. Add one more test for a different branch so you know your mock behaves deterministically across scenarios. This is where beginners usually stop too early and miss regressions in prompt routing or parsing logic.
```ts
// tests/mockLlm.test.ts
import { describe, expect, it } from "vitest";
import { MockLlm } from "./mockLlm";

describe("MockLlm", () => {
  it("returns billing when ticket mentions duplicate charges", async () => {
    const llm = new MockLlm();
    const output = await llm.call("I was charged twice for my subscription.");
    expect(output).toBe("billing");
  });

  it("returns technical for other prompts", async () => {
    const llm = new MockLlm();
    const output = await llm.call("The app crashes on startup.");
    expect(output).toBe("technical");
  });
});
```
## Testing It
Run your tests with `vitest run` or your configured test command. You should see deterministic output every time because nothing is calling an external model.
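If you haven't set up a test command yet, a minimal Vitest config looks something like this. The `include` glob is an assumption; point it at wherever your tests actually live.

```ts
// vitest.config.ts
// A minimal sketch; adjust the include glob to your project layout.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: ["tests/**/*.test.ts"],
  },
});
```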
If the crew test fails, check three things first:
- Your agent accepts an injected `llm`
- Your mock returns plain strings in the format your task expects
- Your assertions match what CrewAI actually returns from `crew.run()`
If you want to validate production behavior later, keep this mock-based suite and add a separate integration test file that uses a real API key behind an environment flag.
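Here is one way to sketch that gate, assuming the production `buildSupportCrew` factory from earlier and an `OPENAI_API_KEY` environment variable (swap in whichever provider key you actually use):

```ts
// tests/crew.integration.test.ts
import { describe, expect, it } from "vitest";
import { buildSupportCrew } from "../src/crew";

// Only run when a real key is present; CI without the key skips this test.
const hasKey = Boolean(process.env.OPENAI_API_KEY);

describe("support crew (integration)", () => {
  it.runIf(hasKey)("classifies a real charge dispute", async () => {
    const crew = buildSupportCrew();
    const result = await crew.run();
    expect(String(result).toLowerCase()).toContain("billing");
  });
});
```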
## Next Steps
- Add snapshot tests for structured outputs like JSON classification results (see the sketch after this list)
- Build an adapter layer so all LLM providers implement one internal interface
- Add contract tests that verify prompts still produce valid categories after prompt edits
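For the first item, a minimal snapshot test might look like this, assuming a mock that returns a JSON string. The payload shape here is hypothetical; mirror whatever your crew really emits.

```ts
// tests/classification.snapshot.test.ts
import { describe, expect, it } from "vitest";

// Hypothetical mock that returns structured output as a JSON string.
class MockJsonLlm {
  async call(_prompt: string): Promise<string> {
    return JSON.stringify({ category: "billing", confidence: "high" });
  }
}

describe("structured classification output", () => {
  it("matches the stored snapshot", async () => {
    const llm = new MockJsonLlm();
    const raw = await llm.call("I was charged twice.");
    // Parse first so incidental whitespace changes don't break the snapshot.
    expect(JSON.parse(raw)).toMatchSnapshot();
  });
});
```

A contract test for the third item can be as simple as parsing the output and asserting the category is one of `billing`, `login`, or `technical`.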
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.