LangGraph Tutorial (Python): testing agents locally for advanced developers
This tutorial shows you how to run and test a LangGraph agent locally in Python, with the same graph structure you would use in production. The goal is to make debugging deterministic: you can inspect state, mock model calls, and verify tool execution without pushing anything to a remote service.
What You'll Need
- Python 3.10+
- langgraph
- langchain-core
- langchain-openai
- pytest
- An OpenAI API key set as OPENAI_API_KEY if you want to run real model calls
- Optional but useful:
  - python-dotenv for local env loading (see the snippet after the install command)
  - pydantic for structured state models
Install the packages:

```bash
pip install langgraph langchain-core langchain-openai pytest python-dotenv
```
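If you keep the key in a local .env file, python-dotenv can load it before anything touches the model. A minimal sketch, assuming a .env file containing OPENAI_API_KEY sits in your working directory:

```python
# Load OPENAI_API_KEY (and any other settings) from a local .env file.
from dotenv import load_dotenv

load_dotenv()  # silently does nothing if the file is missing
```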
Step-by-Step
- Start with a minimal graph state and one tool. For local testing, keep the state small and explicit so assertions are easy.
```python
from typing import Annotated, TypedDict

from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages


class State(TypedDict):
    messages: Annotated[list, add_messages]


@tool
def lookup_policy(policy_id: str) -> str:
    """Return a fake policy status for local testing."""
    return f"Policy {policy_id} is active"


def should_continue(state: State) -> str:
    last = state["messages"][-1]
    if isinstance(last, AIMessage) and last.tool_calls:
        return "tools"
    return END
```
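Because should_continue is a plain function of state, you can sanity-check the routing logic on its own before any graph exists. A quick sketch:

```python
# No tool calls on the last AI message -> the run should end.
assert should_continue({"messages": [AIMessage(content="done")]}) == END

# A pending tool call -> route to the tools node.
pending = AIMessage(
    content="",
    tool_calls=[{"name": "lookup_policy", "args": {"policy_id": "1"}, "id": "t1"}],
)
assert should_continue({"messages": [pending]}) == "tools"
```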
- Build a graph that can call tools and then return control to the model. This pattern is what you want to test locally before wiring in real dependencies.
```python
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import ToolNode

# ChatOpenAI picks up OPENAI_API_KEY from the environment.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
tools = [lookup_policy]
llm_with_tools = llm.bind_tools(tools)


def call_model(state: State) -> dict:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}


graph = StateGraph(State)
graph.add_node("agent", call_model)
graph.add_node("tools", ToolNode(tools))
graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")
app = graph.compile()
```
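Before invoking anything, it can help to print the compiled topology and confirm the wiring matches your intent. A small sketch; note the ASCII renderer needs the grandalf package installed:

```python
# Should show START -> agent, agent -> (tools | END), and tools -> agent.
app.get_graph().print_ascii()
```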
- Run the graph once with a real prompt and inspect the output messages. For local testing, print the full message trace so you can see whether tool calls happened where you expected.
```python
if __name__ == "__main__":
    result = app.invoke(
        {"messages": [HumanMessage(content="Check policy 12345 status")]}
    )
    for msg in result["messages"]:
        print(type(msg).__name__, "=>", msg.content)
        if hasattr(msg, "tool_calls") and msg.tool_calls:
            print("tool_calls =>", msg.tool_calls)
```
- Add a deterministic test by mocking the model instead of calling OpenAI. This is the part most teams skip, and it is exactly what makes local agent testing reliable.
```python
from langchain_core.language_models.fake_chat_models import FakeMessagesListChatModel

# Two canned responses, one per agent turn: the first emits a tool call,
# which routes to the tools node; the second is the final answer after
# the tool result comes back.
fake_model = FakeMessagesListChatModel(
    responses=[
        AIMessage(
            content="",
            tool_calls=[{"name": "lookup_policy", "args": {"policy_id": "12345"}, "id": "call_1"}],
        ),
        AIMessage(content="Policy 12345 is active"),
    ]
)


def fake_call_model(state: State) -> dict:
    # The fake model replays its responses in order regardless of input,
    # so there is nothing to bind tools to (FakeMessagesListChatModel does
    # not implement bind_tools); invoke it directly.
    response = fake_model.invoke(state["messages"])
    return {"messages": [response]}
```
- Swap the node implementation and assert on behavior with pytest. You want tests that verify both the final answer and the intermediate tool invocation path (the second test below covers the tool path).
```python
def build_app(call_model_fn):
    g = StateGraph(State)
    g.add_node("agent", call_model_fn)
    g.add_node("tools", ToolNode(tools))
    g.add_edge(START, "agent")
    g.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
    g.add_edge("tools", "agent")
    return g.compile()


def test_agent_tool_flow():
    test_app = build_app(fake_call_model)
    result = test_app.invoke({"messages": [HumanMessage(content="Check policy 12345 status")]})
    final_message = result["messages"][-1]
    assert isinstance(final_message, AIMessage)
    assert final_message.content == "Policy 12345 is active"
```
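That test pins the final answer. To cover the intermediate tool path too, assert that exactly one ToolMessage with the expected payload appears in the trace. A sketch under the same setup; it relies on FakeMessagesListChatModel cycling back to its first response once the list is exhausted, so each test sees the full sequence:

```python
from langchain_core.messages import ToolMessage


def test_agent_invokes_lookup_policy():
    test_app = build_app(fake_call_model)
    result = test_app.invoke({"messages": [HumanMessage(content="Check policy 12345 status")]})
    # ToolNode should have executed lookup_policy exactly once.
    tool_messages = [m for m in result["messages"] if isinstance(m, ToolMessage)]
    assert len(tool_messages) == 1
    assert tool_messages[0].content == "Policy 12345 is active"
```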
Testing It
Run the script once with your real model configured and confirm that the message trace includes an assistant tool call followed by a tool response. Then run pytest and verify the deterministic test passes without any network dependency.
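Assuming everything above lives in a single file named test_agent.py (an illustrative name, not something the code requires), that boils down to:

```bash
python test_agent.py      # real model: prints the full message trace
pytest test_agent.py -q   # fake model: deterministic, no network calls
```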
If your graph hangs or loops forever, your conditional edge logic is wrong or your model keeps emitting tool calls after the tool result. In practice, I also log state["messages"][-3:] during development because it makes bad transitions obvious fast.
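A related trick is to stream the graph node by node instead of invoking it, so every transition is visible as it happens. A minimal sketch against the app compiled earlier, using LangGraph's stream_mode="updates":

```python
# Print each node's state update as the graph executes.
for step in app.stream(
    {"messages": [HumanMessage(content="Check policy 12345 status")]},
    stream_mode="updates",
):
    for node, update in step.items():
        print(f"[{node}]", [type(m).__name__ for m in update["messages"]])
```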
Next Steps
- Add more tools and test each one with isolated fake responses.
- Move from plain TypedDict state to Pydantic models when your agent state gets larger (see the sketch below).
- Test streaming with .stream() so you can verify partial outputs before shipping to production; the debugging sketch above uses the same API.
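On the Pydantic point: LangGraph accepts a BaseModel subclass as the state schema, which buys you run-time validation of state. A minimal sketch assuming pydantic v2; AgentState and the policy_id field are illustrative, not part of the code above:

```python
from typing import Annotated, Optional

from langchain_core.messages import AnyMessage
from langgraph.graph import StateGraph
from langgraph.graph.message import add_messages
from pydantic import BaseModel


class AgentState(BaseModel):
    messages: Annotated[list[AnyMessage], add_messages]
    policy_id: Optional[str] = None  # extra typed fields get validated


# Drop-in replacement for the TypedDict schema; nodes then read
# state.messages (attribute access) instead of state["messages"].
graph = StateGraph(AgentState)
```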
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.