LlamaIndex Tutorial (TypeScript): mocking LLM calls in tests for advanced developers

By Cyprian Aarons · Updated 2026-04-21
Tags: llamaindex, mocking-llm-calls-in-tests-for-advanced-developers, typescript

This tutorial shows how to make LlamaIndex TypeScript tests deterministic by replacing real LLM calls with mocks. You need this when your agent logic is stable but your tests are flaky, slow, or expensive because they depend on network calls and model variance.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-node or a build step
  • llamaindex installed
  • A test runner like vitest or jest
  • If you also want to run live calls outside tests:
    • OPENAI_API_KEY
    • OPENAI_MODEL (optional) if you want to override the default model

Install the packages:

npm install llamaindex
npm install -D vitest typescript tsx @types/node

Step-by-Step

  1. Start by creating a small service that uses LlamaIndex through an injected LLM. The key pattern is dependency injection: your production code accepts an LLM, and your tests swap in a mock implementation.
// src/answerService.ts
import { OpenAI, type LLM } from "llamaindex";

export class AnswerService {
  constructor(private readonly llm: LLM) {}

  async answer(question: string): Promise<string> {
    const response = await this.llm.complete({
      prompt: `Answer in one sentence: ${question}`,
    });

    return response.text.trim();
  }
}

export function createProductionService() {
  return new AnswerService(
    new OpenAI({
      model: process.env.OPENAI_MODEL ?? "gpt-4o-mini",
    })
  );
}
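Outside the test suite, the factory above is the only place that touches the real client. A quick way to try it (illustrative script path; assumes OPENAI_API_KEY is exported in your shell) is a small script run with tsx:

// scripts/ask.ts -- illustrative manual check of the production wiring (run with: npx tsx scripts/ask.ts)
import { createProductionService } from "../src/answerService";

async function main() {
  const service = createProductionService();

  // This call hits OpenAI, so keep it out of the automated test suite.
  const answer = await service.answer("What is the capital of France?");
  console.log(answer);
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});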
  2. Build a mock LLM that implements the same interface shape your service uses. For this test case, you only need complete, and you can return fixed text based on the prompt content.
// test/mocks.ts
import type { LLM } from "llamaindex";

// Note: depending on your llamaindex version, the LLM interface may require more
// members (metadata, streaming variants, and so on). Add stubs for those, or cast
// the mock to LLM at the call site, if the compiler complains.
export class MockLLM implements LLM {
  async complete(input: { prompt: string }) {
    if (input.prompt.includes("capital of France")) {
      return { text: "Paris." };
    }

    return { text: "Mocked answer." };
  }

  // Minimal stubs for methods not used by this tutorial's service.
  // If your code calls more methods, implement them here too.
  chat(): never {
    throw new Error("Not implemented in MockLLM");
  }
}
  3. Write a test that uses the mock instead of the real OpenAI client. This keeps the test offline and stable while still exercising your service logic end to end.
// test/answerService.test.ts
import { describe, expect, it } from "vitest";
import { AnswerService } from "../src/answerService";
import { MockLLM } from "./mocks";

describe("AnswerService", () => {
  it("returns mocked output for a known prompt", async () => {
    const service = new AnswerService(new MockLLM());

    const result = await service.answer("What is the capital of France?");

    expect(result).toBe("Paris.");
  });

  it("returns fallback mocked output for other prompts", async () => {
    const service = new AnswerService(new MockLLM());

    const result = await service.answer("Explain retries");

    expect(result).toBe("Mocked answer.");
  });
});
  4. If your code uses more than complete, mock those methods too. Advanced agent code often calls chat-style APIs or structured outputs, so keep the same injection pattern and add only the surface area you actually use; a chat mock sketch follows the service code below.
// src/chatService.ts
import type { ChatMessage, LLM } from "llamaindex";

export class ChatService {
  constructor(private readonly llm: LLM) {}

  async reply(userText: string): Promise<string> {
    const response = await this.llm.chat({
      messages: [{ role: "user", content: userText } as ChatMessage],
    });

    const content = response.message.content;

    // Depending on the model and llamaindex version, content can be a plain string
    // or an array of content parts; normalize to a string for this example.
    return typeof content === "string" ? content : JSON.stringify(content);
  }
}
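To keep ChatService tests offline, mirror the pattern with a chat-capable mock. A minimal sketch, assuming the simplified response shape ({ message: { role, content } }) that reply reads; if your llamaindex version's LLM interface demands more members, cast the mock at the call site:

// test/chatMocks.ts
import type { ChatMessage } from "llamaindex";

export class MockChatLLM {
  async chat(input: { messages: ChatMessage[] }) {
    const last = input.messages[input.messages.length - 1];

    // Echo a canned reply that references the last user message.
    return {
      message: {
        role: "assistant",
        content: `Mocked reply to: ${String(last?.content ?? "")}`,
      } as ChatMessage,
    };
  }
}

// In a test, inject it where ChatService expects an LLM, casting if needed:
//   const service = new ChatService(new MockChatLLM() as unknown as LLM);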
  5. Use a stricter mock when you want to verify prompts, not just outputs. This is useful for compliance-sensitive systems where prompt wording matters, because you can assert on exact request content before returning a canned response.
// test/strictMockLLM.ts
import type { LLM } from "llamaindex";

export class StrictMockLLM implements LLM {
  async complete(input: { prompt: string }) {
    if (!input.prompt.startsWith("Answer in one sentence:")) {
      throw new Error(`Unexpected prompt: ${input.prompt}`);
    }

    return { text: "Approved mocked response." };
  }

  chat(): never {
    throw new Error("Not implemented in StrictMockLLM");
  }
}
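A test against the strict mock then reads like any other unit test: the canned text is secondary, and the real check is that any drift in the prompt template makes the mock throw (file name and assertions here are illustrative):

// test/promptContract.test.ts
import { describe, expect, it } from "vitest";
import { AnswerService } from "../src/answerService";
import { StrictMockLLM } from "./strictMockLLM";

describe("prompt contract", () => {
  it("sends the expected prompt template to the LLM", async () => {
    const service = new AnswerService(new StrictMockLLM());

    // If AnswerService ever changes its prompt prefix, the strict mock throws
    // and this assertion never resolves to the canned text.
    await expect(service.answer("What is the capital of France?")).resolves.toBe(
      "Approved mocked response."
    );
  });
});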

Testing It

Run your test suite with vitest:

npx vitest run

You should see the tests pass without any API key set, because nothing reaches OpenAI. If a test fails, check whether your mock matches the exact method signature your production code calls; most issues come from using complete when the code actually uses chat, or vice versa.

A good sanity check is to temporarily replace new MockLLM() with new OpenAI(...) in one local test run and confirm the behavior changes only at the boundary. That tells you your business logic is isolated correctly.
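If you want to keep that sanity check around without breaking offline runs, one option is an opt-in live test that is skipped when no key is present (a sketch; it assumes a recent Vitest with it.skipIf):

// test/answerService.live.test.ts
import { describe, expect, it } from "vitest";
import { createProductionService } from "../src/answerService";

describe("AnswerService (live)", () => {
  // Runs only when OPENAI_API_KEY is set, so CI without credentials stays green.
  it.skipIf(!process.env.OPENAI_API_KEY)("answers via the real model", async () => {
    const service = createProductionService();

    const result = await service.answer("What is the capital of France?");

    // Live output varies, so assert on a stable substring rather than exact text.
    expect(result.toLowerCase()).toContain("paris");
  });
});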

Next Steps

  • Add snapshot tests for tool-call payloads so you can lock down agent behavior across releases.
  • Extend the mock to cover streaming responses if you use token-by-token UI updates (see the sketch after this list).
  • Wrap this pattern into a reusable testing helper for all internal agents and workflows.
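For the streaming item above, the same injection idea applies. The sketch below assumes a chunk shape of { delta: string } and an illustrative streamComplete method name; match whatever streaming surface your code actually calls (recent llamaindex versions often stream via complete({ ..., stream: true })).

// test/streamingMock.ts -- streaming mock sketch (method name and chunk shape are assumptions)
export class StreamingMockLLM {
  // Yields a canned answer a few tokens at a time so token-by-token UI code can be tested offline.
  async *streamComplete(_input: { prompt: string }) {
    for (const delta of ["Mocked ", "streaming ", "answer."]) {
      yield { delta };
    }
  }
}

// Usage in a test: collect the chunks and assert on the joined text.
//   const chunks: string[] = [];
//   for await (const chunk of new StreamingMockLLM().streamComplete({ prompt: "hi" })) {
//     chunks.push(chunk.delta);
//   }
//   expect(chunks.join("")).toBe("Mocked streaming answer.");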

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
