Haystack Tutorial (Python): mocking LLM calls in tests for intermediate developers
This tutorial shows how to write deterministic tests for Haystack pipelines that call LLMs, without hitting a real model endpoint. You’ll mock the generator component so your tests run fast, stay cheap, and don’t fail because of network or provider instability.
What You'll Need
- Python 3.10+
- haystack-ai
- pytest
- A basic Haystack pipeline that uses an LLM-backed component
- No API key required for the mocked test path
- Optional: python-dotenv if you load real credentials in non-test environments (see the short sketch after the install command)
Install the packages:
```bash
pip install haystack-ai pytest
```
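If you do load real credentials outside of tests, a minimal python-dotenv sketch follows; the config.py module name and the environment variable handling are illustrative, not required by Haystack:

```python
# config.py -- illustrative module name, not part of Haystack
import os

from dotenv import load_dotenv

# Reads key=value pairs from a local .env file into the process environment.
# Safe in test environments too: it does nothing when no .env file exists.
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # stays None on the mocked test path
```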
Step-by-Step
- Start with a small pipeline that includes an LLM-style component. For testing, the important part is that the component has a predictable input/output contract.
```python
from haystack import Pipeline, component


@component
class PromptBuilder:
    @component.output_types(prompt=str)
    def run(self, question: str):
        return {"prompt": f"Answer briefly: {question}"}


@component
class FakeLLM:
    @component.output_types(replies=list[str])
    def run(self, prompt: str):
        return {"replies": [f"MOCKED: {prompt}"]}


pipe = Pipeline()
pipe.add_component("builder", PromptBuilder())
pipe.add_component("llm", FakeLLM())
pipe.connect("builder.prompt", "llm.prompt")

result = pipe.run({"builder": {"question": "What is Haystack?"}})
print(result["llm"]["replies"][0])  # MOCKED: Answer briefly: What is Haystack?
```
- In production, you'd swap the fake component for a real generator like OpenAIGenerator. The point of mocking is not to rewrite your app; it's to keep the pipeline shape identical while replacing only the network call.
```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator


def build_pipeline(generator):
    pipe = Pipeline()
    pipe.add_component("builder", PromptBuilder(template="Answer briefly: {{question}}"))
    pipe.add_component("llm", generator)
    pipe.connect("builder.prompt", "llm.prompt")
    return pipe


# Real usage would look like this. In haystack-ai 2.x the generator reads
# OPENAI_API_KEY from the environment by default; api_key, if passed
# explicitly, must be a Secret, not a plain string.
# generator = OpenAIGenerator(model="gpt-4o-mini")
# pipe = build_pipeline(generator)
```
- Write a test double that returns fixed output and records what it received. This gives you two assertions: the pipeline wiring is correct, and your app sends the prompt you expect.
```python
from haystack import component


@component
class MockGenerator:
    def __init__(self, reply: str):
        self.reply = reply
        self.seen_prompts = []

    @component.output_types(replies=list[str])
    def run(self, prompt: str):
        self.seen_prompts.append(prompt)
        return {"replies": [self.reply]}
```
- Use the mock in a pytest test and assert on both the output and the prompt content. This is the part that makes your tests stable: no external calls, no retries, no rate limits.
```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder

from test_mock_llm import MockGenerator


def build_pipeline(generator):
    pipe = Pipeline()
    pipe.add_component("builder", PromptBuilder(template="Answer briefly: {{question}}"))
    pipe.add_component("llm", generator)
    pipe.connect("builder.prompt", "llm.prompt")
    return pipe


def test_pipeline_uses_mocked_llm():
    mock_llm = MockGenerator(reply="Haystack is a Python framework for building LLM apps.")
    pipe = build_pipeline(mock_llm)

    result = pipe.run({"builder": {"question": "What is Haystack?"}})

    assert result["llm"]["replies"][0] == "Haystack is a Python framework for building LLM apps."
    assert mock_llm.seen_prompts == ["Answer briefly: What is Haystack?"]
```
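If several tests need the same wiring, a pytest fixture keeps them short. A minimal sketch, assuming build_pipeline and MockGenerator from the listing above are in scope:

```python
import pytest


@pytest.fixture
def mock_llm():
    # A fresh mock per test keeps seen_prompts isolated between tests.
    return MockGenerator(reply="Haystack is a Python framework for building LLM apps.")


def test_prompt_is_recorded(mock_llm):
    pipe = build_pipeline(mock_llm)
    pipe.run({"builder": {"question": "What is Haystack?"}})
    assert mock_llm.seen_prompts == ["Answer briefly: What is Haystack?"]
```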
- If your code wraps Haystack behind a service function, inject and mock the generator at that boundary rather than reaching into pipeline internals. That keeps your tests readable and avoids coupling them to internal pipeline structure; a matching boundary test follows the listing below.
```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder


def answer_question(question: str, generator):
    pipe = Pipeline()
    pipe.add_component("builder", PromptBuilder(template="Answer briefly: {{question}}"))
    pipe.add_component("llm", generator)
    pipe.connect("builder.prompt", "llm.prompt")
    result = pipe.run({"builder": {"question": question}})
    return result["llm"]["replies"][0]
```
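A test at that boundary then depends only on the function's contract, not the pipeline layout. A minimal sketch, where myapp.qa is a hypothetical module path for answer_question:

```python
from myapp.qa import answer_question  # hypothetical module path
from test_mock_llm import MockGenerator


def test_answer_question_returns_mocked_reply():
    mock_llm = MockGenerator(reply="A fixed answer.")

    answer = answer_question("What is Haystack?", generator=mock_llm)

    # Assert on the public contract only: the returned string and the prompt sent.
    assert answer == "A fixed answer."
    assert mock_llm.seen_prompts == ["Answer briefly: What is Haystack?"]
```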
Testing It
Run your test suite with pytest -q. The test should pass without any environment variables for model access because nothing leaves your process.
If you want to confirm the mock is actually being used, change the expected reply in the assertion and watch it fail immediately. That tells you your test is deterministic and not accidentally calling a live API.
For extra confidence, add one integration test separately with a real generator and mark it as slow or optional. Keep mocked unit tests as your default path.
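One way to set that up is a pytest marker plus a skip when no key is configured. A sketch, assuming build_pipeline from earlier is importable and the integration marker is registered in your pytest config (see Next Steps):

```python
import os

import pytest

from haystack.components.generators import OpenAIGenerator


@pytest.mark.integration
@pytest.mark.skipif("OPENAI_API_KEY" not in os.environ, reason="no API key configured")
def test_real_generator_answers():
    # Real network call: slow, costs money, and the reply text is nondeterministic.
    pipe = build_pipeline(OpenAIGenerator(model="gpt-4o-mini"))
    result = pipe.run({"builder": {"question": "What is Haystack?"}})
    assert result["llm"]["replies"][0]  # only assert a non-empty reply
```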
Next Steps
- Learn how to mock retrieval components like InMemoryBM25Retriever or document stores in Haystack pipelines.
- Add contract tests around prompt templates so changes to prompt text don't break production behavior.
- Split unit tests from integration tests so only a small subset ever touches real LLM providers (see the conftest.py sketch below).
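To make that split concrete, you can register the marker in a conftest.py and deselect it by default. A minimal sketch using pytest's standard configuration hook:

```python
# conftest.py -- register the marker so pytest doesn't warn about unknown marks
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "integration: tests that call a real LLM provider"
    )

# Run only the fast, mocked suite:               pytest -m "not integration"
# Run everything, including real-provider tests: pytest
```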
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist + starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.