LlamaIndex Tutorial (TypeScript): mocking LLM calls in tests for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to write TypeScript tests for LlamaIndex code without calling a real model. You need this when your unit tests should be fast, deterministic, and not burn API credits every time CI runs.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-jest, vitest, or jest
  • llamaindex installed
  • A test runner set up in your repo
  • No API key required for the mocked tests in this tutorial
  • Optional: an OpenAI API key if you want to compare mocked vs real behavior later

Step-by-Step

  1. Install the LlamaIndex package and a test runner if you do not already have them. The important part is that your app code uses the same query path in production and tests.
npm install llamaindex
npm install -D vitest typescript @types/node
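If your repo does not already have one, a minimal vitest config documents the test environment explicitly. This is optional: vitest's defaults already discover `*.test.ts` files, so the sketch below mainly makes the setup visible.

```typescript
// vitest.config.ts — minimal, explicit config. Vitest's defaults already
// pick up *.test.ts files, so this mainly documents the environment.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    environment: "node", // no browser APIs needed for these tests
  },
});
```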
  2. Create a small wrapper around your LlamaIndex call. This keeps the test surface narrow, which makes mocking much easier than trying to mock your entire app.
// src/qa.ts
import { Document, VectorStoreIndex } from "llamaindex";

export async function answerQuestion(question: string): Promise<string> {
  const docs = [
    new Document({ text: "LlamaIndex is used to build RAG applications." }),
    new Document({ text: "Unit tests should not depend on live LLM calls." }),
  ];

  const index = await VectorStoreIndex.fromDocuments(docs);
  const queryEngine = index.asQueryEngine();

  const response = await queryEngine.query({ query: question });
  return response.toString();
}
  3. Mock the llamaindex module at the import boundary rather than your own function. A common beginner mistake is to spy on answerQuestion itself and then call it: that only tests the mock, not your code (and spying on an ES module namespace fails in vitest anyway). With vi.mock you replace the model-facing pieces while still exercising the real logic in qa.ts.
// src/qa.test.ts
import { describe, expect, it, vi } from "vitest";

// vi.mock is hoisted above the imports, so answerQuestion receives
// these fakes instead of the real Document and VectorStoreIndex.
vi.mock("llamaindex", () => ({
  Document: vi.fn(),
  VectorStoreIndex: {
    fromDocuments: vi.fn().mockResolvedValue({
      asQueryEngine: () => ({
        query: vi.fn().mockResolvedValue({ toString: () => "Mocked answer" }),
      }),
    }),
  },
}));

import { answerQuestion } from "./qa";

describe("answerQuestion", () => {
  it("returns a predictable answer when the query engine is mocked", async () => {
    const result = await answerQuestion("What is LlamaIndex?");
    expect(result).toBe("Mocked answer");
  });
});
  4. If you want to test code that directly uses a query engine object, mock the method your code actually calls. This pattern is better than stubbing network requests because it stays close to the contract your application depends on.
// src/service.ts
export interface QueryEngineLike {
  query(input: { query: string }): Promise<{ toString(): string }>;
}

export async function getAnswer(
  engine: QueryEngineLike,
  question: string,
): Promise<string> {
  const response = await engine.query({ query: question });
  return response.toString();
}
// src/service.test.ts
import { describe, expect, it, vi } from "vitest";
import { getAnswer, QueryEngineLike } from "./service";

describe("getAnswer", () => {
  it("uses a mocked engine response", async () => {
    const engine: QueryEngineLike = {
      query: vi.fn().mockResolvedValue({
        toString: () => "Mocked engine response",
      }),
    };

    const result = await getAnswer(engine, "Explain RAG");
    expect(result).toBe("Mocked engine response");
  });
});
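If you would rather not reach for vi.fn at all, a hand-rolled fake does the same job and additionally records the arguments your code passed, so you can assert on them directly. This sketch re-declares the QueryEngineLike shape and getAnswer locally so it runs standalone; makeFakeEngine is an illustrative helper, not a llamaindex API.

```typescript
// A minimal hand-rolled fake: records calls so the test can assert on them.
interface QueryEngineLike {
  query(input: { query: string }): Promise<{ toString(): string }>;
}

function makeFakeEngine(answer: string) {
  const calls: { query: string }[] = [];
  const engine: QueryEngineLike = {
    async query(input) {
      calls.push(input); // record what the app asked
      return { toString: () => answer };
    },
  };
  return { engine, calls };
}

async function getAnswer(engine: QueryEngineLike, question: string) {
  const response = await engine.query({ query: question });
  return response.toString();
}

async function demo() {
  const { engine, calls } = makeFakeEngine("Fake RAG answer");
  const result = await getAnswer(engine, "Explain RAG");
  console.log(result); // "Fake RAG answer"
  console.log(calls[0].query); // "Explain RAG"
}

demo();
```

The recorded `calls` array gives you the same "was it called with X?" checks that vi.fn provides, without any framework dependency.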
  5. If you prefer one integration-style test against real LlamaIndex objects, be aware that VectorStoreIndex.fromDocuments embeds your documents with the configured embedding model, which by default reaches an external provider. Swap in a local fake embedding model first so the test stays offline and needs no API key.
// src/integration-like.test.ts
import { describe, expect, it } from "vitest";
import { Document, Settings, VectorStoreIndex } from "llamaindex";

// Fake embedding model so fromDocuments never calls an external API.
// The exact embedding interface varies across llamaindex versions;
// adjust the method names to match the version you have installed.
Settings.embedModel = {
  getTextEmbedding: async () => [0.1, 0.2, 0.3],
  getTextEmbeddings: async (texts: string[]) => texts.map(() => [0.1, 0.2, 0.3]),
  getQueryEmbedding: async () => [0.1, 0.2, 0.3],
} as any;

describe("VectorStoreIndex wiring", () => {
  it("can build an index without calling an external model", async () => {
    const docs = [new Document({ text: "Testing LlamaIndex locally." })];
    const index = await VectorStoreIndex.fromDocuments(docs);

    const engine = index.asQueryEngine();
    expect(engine).toBeTruthy();
  });
});

Testing It

Run your test suite with your normal command, for example npx vitest run. The mocked tests should pass instantly and should not require any environment variables or network access.

If a test hangs or tries to reach an API, that means your application code is still creating a real model client somewhere outside the mocked boundary. Move that dependency behind an interface or inject it into the function under test.
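The "inject it" advice can be made concrete with a small factory that tests override. This is an illustrative sketch, not a llamaindex API: Engine, EngineFactory, and answerWith are names invented for the example.

```typescript
// Illustrative sketch: the app asks a factory for its engine,
// so tests can substitute a fake without touching llamaindex.
interface Engine {
  query(input: { query: string }): Promise<{ toString(): string }>;
}

type EngineFactory = () => Promise<Engine>;

async function answerWith(factory: EngineFactory, question: string) {
  const engine = await factory();
  const response = await engine.query({ query: question });
  return response.toString();
}

// Production code would return a real LlamaIndex query engine here;
// the test supplies a canned one instead.
const fakeFactory: EngineFactory = async () => ({
  query: async () => ({ toString: () => "canned answer" }),
});

answerWith(fakeFactory, "What is RAG?").then((a) => console.log(a));
// prints "canned answer"
```

Because the factory is the only place a real model client is created, a hanging test can only mean the factory (not your business logic) is misconfigured.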

A good sanity check is to run the suite with no .env file loaded. If everything still passes, your mocks are doing their job.
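On Unix-like systems you can go one step further and unset provider variables for a single run with `env -u`. The variable names below are examples; adjust them to whichever providers your app uses.

```shell
# Run the suite with provider keys explicitly unset for this invocation only;
# your shell's environment is left untouched.
env -u OPENAI_API_KEY -u ANTHROPIC_API_KEY npx vitest run
```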

Next Steps

  • Learn dependency injection for LlamaIndex service wrappers so you can swap real and fake engines cleanly.
  • Add snapshot tests for prompt formatting if your app builds custom prompts before querying.
  • Move from unit mocks to contract tests that validate output shape without hitting production APIs.
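For the snapshot-test bullet above, the key property is that prompt formatting is a pure function of its inputs, which makes its output stable enough to snapshot. A sketch of such a builder (buildPrompt is an illustrative name, not a llamaindex API):

```typescript
// Illustrative pure prompt builder: deterministic string in, string out,
// which is exactly the kind of function snapshot tests handle well.
function buildPrompt(context: string[], question: string): string {
  return [
    "Answer using only the context below.",
    "",
    ...context.map((c, i) => `[${i + 1}] ${c}`),
    "",
    `Question: ${question}`,
  ].join("\n");
}

console.log(buildPrompt(["LlamaIndex builds RAG apps."], "What is LlamaIndex?"));
```

In a vitest test you would pass the returned string to expect(...).toMatchSnapshot(), so any accidental prompt change shows up as a snapshot diff.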


By Cyprian Aarons, AI Consultant at Topiax.
