LangGraph Tutorial (Python): mocking LLM calls in tests for beginners

By Cyprian Aarons · Updated 2026-04-22

This tutorial shows you how to write LangGraph tests without calling a real LLM. You’ll build a small graph, replace the model with a mock in tests, and verify your workflow logic deterministically.

What You'll Need

  • Python 3.10+
  • langgraph
  • langchain-core
  • pytest
  • Optional: langchain-openai if you want to compare against a real model later
  • No API key is required for the test setup in this tutorial

Install the packages:

pip install langgraph langchain-core pytest
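
If you plan to compare against a real model later, also install the optional provider package:

pip install langchain-openai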

Step-by-Step

  1. Start with a tiny graph that calls an LLM-like function through dependency injection.
    The important part is that the graph does not know whether it is using a real model or a fake one.
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class ChatState(TypedDict):
    # No reducer needed: each node returns the full, already-appended message list.
    messages: list


def build_graph(llm):
    def call_model(state: ChatState):
        response = llm.invoke(state["messages"])
        return {"messages": state["messages"] + [response]}

    builder = StateGraph(ChatState)
    builder.add_node("call_model", call_model)
    builder.add_edge(START, "call_model")
    builder.add_edge("call_model", END)
    return builder.compile()
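Because the model is injected, swapping in a real model later is just a different argument to build_graph. As a rough sketch, assuming you have installed langchain-openai and exported an API key (the model name below is only an example):

from langchain_openai import ChatOpenAI

# Same graph, real model; tests keep passing a fake instead.
graph = build_graph(ChatOpenAI(model="gpt-4o-mini"))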
  2. Create a fake LLM for tests.
    This object only needs an invoke() method that returns an AIMessage, which keeps your test fast and fully deterministic.
from langchain_core.messages import AIMessage


class FakeLLM:
    def __init__(self, reply: str):
        self.reply = reply
        self.calls = []

    def invoke(self, messages):
        self.calls.append(messages)
        return AIMessage(content=self.reply)
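If a node ends up calling the model more than once, a single canned reply is not enough. One option, sketched below, is a fake that works through a scripted list of replies in order:

class ScriptedFakeLLM:
    def __init__(self, replies: list[str]):
        self.replies = list(replies)
        self.calls = []

    def invoke(self, messages):
        self.calls.append(messages)
        # Return the next scripted reply, or a placeholder if the script runs out.
        reply = self.replies.pop(0) if self.replies else "out of replies"
        return AIMessage(content=reply)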
  3. Write a test that runs the graph against the fake model.
    You are testing graph behavior here: did it call the model, and did it append the response correctly?
from langchain_core.messages import HumanMessage


def test_graph_uses_mocked_llm():
    llm = FakeLLM("Hello from mock")
    graph = build_graph(llm)

    result = graph.invoke({
        "messages": [HumanMessage(content="Hi")]
    })

    assert len(llm.calls) == 1
    assert llm.calls[0][0].content == "Hi"
    assert result["messages"][-1].content == "Hello from mock"
  4. Add a second test for branching logic if your graph has conditional paths.
    The same mocking pattern works when a node decides what to do next based on model output; a matching test follows the graph below.
from typing import Literal


class RoutingState(TypedDict):
    messages: list
    route: str


def build_routing_graph(llm):
    def classify(state: RoutingState):
        response = llm.invoke(state["messages"])
        return {"route": response.content}

    def choose_route(state: RoutingState) -> Literal["billing", "support"]:
        return state["route"]

    builder = StateGraph(RoutingState)
    builder.add_node("classify", classify)
    builder.add_node("billing", lambda state: state)
    builder.add_node("support", lambda state: state)
    builder.add_edge(START, "classify")
    # choose_route returns a node name, so it picks the branch at runtime.
    builder.add_conditional_edges("classify", choose_route)
    builder.add_edge("billing", END)
    builder.add_edge("support", END)
    return builder.compile()
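Here is one way the branching test could look with the same FakeLLM. The canned reply doubles as the routing decision, so each branch can get its own test:

def test_routing_graph_takes_billing_branch():
    llm = FakeLLM("billing")
    graph = build_routing_graph(llm)

    result = graph.invoke({
        "messages": [HumanMessage(content="I was charged twice")]
    })

    assert len(llm.calls) == 1
    assert result["route"] == "billing"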
  5. Run the test file with pytest.
    If everything is wired correctly, the graph should execute without any network calls and your assertions should pass consistently.
# test_graph.py
# Assumes build_graph and FakeLLM are importable from your own modules,
# e.g. `from graph import build_graph` and `from fakes import FakeLLM`.
from langchain_core.messages import HumanMessage

def test_graph_uses_mocked_llm():
    llm = FakeLLM("Hello from mock")
    graph = build_graph(llm)

    result = graph.invoke({"messages": [HumanMessage(content="Hi")]})

    assert len(llm.calls) == 1
    assert result["messages"][-1].content == "Hello from mock"

Testing It

Run:

pytest -q

If the test passes, your LangGraph workflow is isolated from external model behavior. That means failures now point to your graph logic, not network latency, rate limits, or prompt drift.

A good sanity check is to change the fake reply and confirm your assertion fails. If it does, your test is actually validating behavior instead of just exercising code paths.
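For example, swapping the canned reply while leaving the assertion alone should produce a failure:

llm = FakeLLM("Different reply")
graph = build_graph(llm)
result = graph.invoke({"messages": [HumanMessage(content="Hi")]})

# This now fails, which is exactly what you want to see.
assert result["messages"][-1].content == "Hello from mock"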

Next Steps

  • Replace the hand-written fake with unittest.mock.Mock once you want stricter call assertions (see the sketch after this list).
  • Add tests for tool nodes and conditional routing so each branch is covered.
  • Compare this pattern with integration tests that hit a real LLM in CI only on scheduled runs.
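
As a sketch of the first idea above, unittest.mock.Mock gives you built-in call assertions without a hand-written fake class (the reply text here is arbitrary):

from unittest.mock import Mock

from langchain_core.messages import AIMessage, HumanMessage


def test_graph_with_mock_object():
    llm = Mock()
    llm.invoke.return_value = AIMessage(content="Hello from Mock")
    graph = build_graph(llm)

    result = graph.invoke({"messages": [HumanMessage(content="Hi")]})

    llm.invoke.assert_called_once()
    assert result["messages"][-1].content == "Hello from Mock"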

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

