LangGraph Tutorial (Python): mocking LLM calls in tests for intermediate developers
This tutorial shows how to test a LangGraph workflow without calling a real LLM. You need this when you want fast, deterministic tests for agent logic, branching, and state updates without burning API credits or waiting on network calls.
What You'll Need
- Python 3.10+
- langgraph
- langchain-core
- pytest
- No API key required for the mocked tests
- Optional: langchain-openai if you want to compare against a real model later

Install the packages:

```bash
pip install langgraph langchain-core pytest
```
Step-by-Step
- Start with a small graph that calls an LLM through a node function. The important part is that the node depends on an injected callable, not on a hardcoded model client.
```python
from typing import TypedDict, Annotated

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage


class State(TypedDict):
    messages: Annotated[list, add_messages]


def make_graph(llm_callable):
    def call_llm(state: State):
        # The node only knows about the injected callable, so tests can
        # swap in a fake without patching any SDK internals.
        response = llm_callable(state["messages"])
        return {"messages": [AIMessage(content=response)]}

    builder = StateGraph(State)
    builder.add_node("call_llm", call_llm)
    builder.add_edge(START, "call_llm")
    builder.add_edge("call_llm", END)
    return builder.compile()
```
- Define a fake LLM for tests. This keeps your graph execution fully local and makes the output predictable enough to assert on.
```python
def fake_llm(messages):
    # Deterministic stand-in for a chat model: echo the last user message.
    last_user_message = messages[-1].content
    return f"mocked reply to: {last_user_message}"


graph = make_graph(fake_llm)
result = graph.invoke(
    {"messages": [HumanMessage(content="What is my policy status?")]}
)
print(result["messages"][-1].content)
```
- Write a real pytest test around the graph. The test should validate both the mocked response and that LangGraph preserved message state correctly.
```python
from langchain_core.messages import HumanMessage


def test_graph_uses_mocked_llm():
    def fake_llm(messages):
        return "approved"

    graph = make_graph(fake_llm)
    result = graph.invoke(
        {"messages": [HumanMessage(content="Check claim status")]}
    )
    assert result["messages"][-1].content == "approved"
    # add_messages appends the AI reply to the input message,
    # so the final state holds exactly two messages.
    assert len(result["messages"]) == 2
```
- If your graph has branches, mock them the same way: inject behavior at the node boundary. This pattern scales better than patching deep SDK internals because your test stays focused on graph behavior.
```python
from typing import TypedDict, Annotated

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage


class BranchState(TypedDict):
    messages: Annotated[list, add_messages]
    route: str


def make_branching_graph(router_fn, llm_fn):
    def route_node(state: BranchState):
        # The routing decision is injected too, and stored in state so
        # tests can assert on it later.
        return {"route": router_fn(state["messages"])}

    def answer_node(state: BranchState):
        text = llm_fn(state["messages"])
        return {"messages": [AIMessage(content=text)]}

    builder = StateGraph(BranchState)
    builder.add_node("route_node", route_node)
    builder.add_node("answer_node", answer_node)
    builder.add_edge(START, "route_node")
    builder.add_edge("route_node", "answer_node")
    builder.add_edge("answer_node", END)
    return builder.compile()
```
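To see the branching graph run end to end with mocks, inject a fake router alongside the fake model. This is a minimal sketch using the make_branching_graph factory above; fake_router and fake_answer are illustrative names, and the keyword check simply marks where a real classification prompt would plug in.

```python
def fake_router(messages):
    # Stand-in for a real routing model: classify by keyword.
    return "claims" if "claim" in messages[-1].content else "renewals"


def fake_answer(messages):
    return "mocked branch answer"


branching_graph = make_branching_graph(fake_router, fake_answer)
result = branching_graph.invoke(
    {"messages": [HumanMessage(content="Why was my claim denied?")]}
)
print(result["route"])                 # claims
print(result["messages"][-1].content)  # mocked branch answer
```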
- Use parameterized tests when you want to cover multiple prompts or outputs. This is where mocking pays off: one test file can cover several agent paths without any network dependency.
```python
import pytest
from langchain_core.messages import HumanMessage


@pytest.mark.parametrize(
    "prompt,expected",
    [
        ("policy renewal date", "renewal path"),
        ("claim denied reason", "claims path"),
    ],
)
def test_multiple_prompts(prompt, expected):
    def fake_llm(messages):
        # Branch on the prompt so each parametrized case exercises a
        # different mocked path.
        if "renewal" in messages[-1].content:
            return "renewal path"
        return "claims path"

    graph = make_graph(fake_llm)
    result = graph.invoke({"messages": [HumanMessage(content=prompt)]})
    assert result["messages"][-1].content == expected
```
Testing It
Run `pytest -q` from the project root. If everything is wired correctly, the tests should pass almost instantly because no external API calls are involved.
If you want to sanity-check the runtime behavior manually, run one of the scripts directly and inspect the final AI message content. The key thing to verify is that the graph output changes only when your fake function changes.
For more confidence, add assertions around intermediate state keys like route or custom metadata fields. That catches regressions in graph wiring before they become production incidents.
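As a sketch of that idea, a test against the branching graph from earlier can pin down the intermediate route value as well as the final reply; the literal route names here are placeholders.

```python
def test_route_key_is_recorded():
    graph = make_branching_graph(
        router_fn=lambda messages: "claims",
        llm_fn=lambda messages: "claims answer",
    )
    result = graph.invoke(
        {"messages": [HumanMessage(content="Check my claim")]}
    )
    # A wiring regression (a dropped edge or renamed node) surfaces here,
    # not just in the final message content.
    assert result["route"] == "claims"
    assert result["messages"][-1].content == "claims answer"
```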
Next Steps
- Mock tool nodes the same way you mocked LLM nodes so your agent tests cover tool routing too.
- Add snapshot-style assertions for full LangGraph state when your workflows get more complex.
- Compare this pattern with dependency injection around ChatOpenAI so you can swap between real and fake models cleanly (see the sketch below).
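As a sketch of that last point (assuming langchain-openai is installed and an OpenAI API key is configured; the model name below is a placeholder), a real chat model can be wrapped in the same callable shape the graph factory expects:

```python
from langchain_openai import ChatOpenAI


def real_llm(messages):
    # Chat models are Runnables: .invoke() on a message list returns an
    # AIMessage, and .content holds the text reply.
    model = ChatOpenAI(model="gpt-4o-mini")  # placeholder model name
    return model.invoke(messages).content


# Same factory, different callable: tests pass fake_llm, production passes real_llm.
production_graph = make_graph(real_llm)
```

Because the graph only ever sees a callable, switching between the real and fake model is a one-line change, and your tests never touch the network.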
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.