LangGraph Tutorial (Python): testing agents locally for intermediate developers
This tutorial shows you how to build a small LangGraph agent in Python and test it locally without wiring up your full app. You’ll get a repeatable setup for running the graph, inspecting state transitions, and catching bad tool behavior before you deploy.
What You'll Need
- Python 3.10+
- langgraph
- langchain-openai
- python-dotenv
- An OpenAI API key in your environment as OPENAI_API_KEY
- A local terminal and a text editor
- Basic familiarity with LangGraph nodes, edges, and state
Step-by-Step
- Create a minimal project and install dependencies. Keep this isolated so your local tests don’t depend on unrelated app code.
python -m venv .venv
source .venv/bin/activate
pip install langgraph langchain-openai python-dotenv
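load_dotenv() in the script below reads OPENAI_API_KEY from a local .env file in the project root. A minimal example; the key value is a placeholder:

# .env (keep this file out of version control)
OPENAI_API_KEY=sk-your-key-here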
- Define a tiny graph with one model node and one tool node. This example uses a calculator tool so you can test routing and state updates locally without external services beyond the model call.
from typing import Annotated, TypedDict
from dotenv import load_dotenv
from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
from langchain_openai import ChatOpenAI
load_dotenv()
class State(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]

@tool
def calculator(expression: str) -> str:
    """Evaluate a simple math expression."""
    # NOTE: eval is unsafe on untrusted input; see Next Steps for a safer parser
    return str(eval(expression, {"__builtins__": {}}))

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools([calculator])

def call_model(state: State):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}
- Wire the graph together exactly like you would in production, but keep the surface area small. The important part here is that the graph can route to tools when needed and stop cleanly when no tool is required.
builder = StateGraph(State)
builder.add_node("agent", call_model)
builder.add_node("tools", ToolNode([calculator]))
builder.add_edge(START, "agent")
# Route to "tools" when the model requests a tool call, otherwise to END
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")
graph = builder.compile()
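If you want to confirm the topology before running anything, the compiled graph can describe itself. A small sketch; draw_mermaid() ships with LangGraph, while the alternative draw_ascii() needs the optional grandalf package:

# Print a Mermaid diagram of the compiled graph to verify nodes and edges
print(graph.get_graph().draw_mermaid())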
- Run a local test case with a known input and inspect the final messages. This gives you a repeatable way to validate the graph behavior before connecting it to any UI or API layer.
if __name__ == "__main__":
    result = graph.invoke(
        {
            "messages": [
                SystemMessage(content="You are a precise assistant."),
                HumanMessage(content="What is 17 * 19? Use the calculator."),
            ]
        }
    )
    for message in result["messages"]:
        print(f"{message.__class__.__name__}: {message.content}")
- Add an intermediate-level test harness that checks both normal responses and tool use. In practice, this is where local testing pays off: you can catch broken prompts, bad routing, or tools returning the wrong shape.
def run_case(prompt: str):
    result = graph.invoke(
        {"messages": [HumanMessage(content=prompt)]}
    )
    return result["messages"][-1].content

cases = [
    "Say hello in one sentence.",
    "What is 12 * 11? Use the calculator.",
]

for case in cases:
    print("INPUT:", case)
    print("OUTPUT:", run_case(case))
    print("-" * 40)
- If you want faster debugging, stream the execution instead of only looking at the final output. Streaming lets you see each state transition locally, which is useful when an agent loops or calls tools unexpectedly.
events = graph.stream(
    {
        "messages": [
            HumanMessage(content="Compute 8 * 13 using the calculator.")
        ]
    }
)
for event in events:
    print(event)
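By default, stream() yields one update per node as the graph executes. If you would rather watch the full message list grow, stream_mode="values" emits a snapshot of the whole state after each step; a sketch:

# Stream full state snapshots instead of per-node updates
for state in graph.stream(
    {"messages": [HumanMessage(content="Compute 8 * 13 using the calculator.")]},
    stream_mode="values",
):
    # The newest message shows what the last step appended
    print(state["messages"][-1].__class__.__name__)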
Testing It
Run the script from your virtual environment and confirm that simple prompts return direct answers while math prompts trigger the calculator tool. If the tool path is working, you should see an intermediate tool call followed by a final assistant message with the computed result.
A good local test also checks failure modes. Try changing the prompt to something ambiguous and make sure your agent still exits cleanly instead of looping forever.
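One way to enforce that locally is LangGraph's recursion limit, which caps the number of steps per invocation and raises GraphRecursionError when the cap is hit. A minimal sketch:

from langgraph.errors import GraphRecursionError

try:
    graph.invoke(
        {"messages": [HumanMessage(content="Keep recalculating forever.")]},
        config={"recursion_limit": 10},  # cap steps for this run
    )
except GraphRecursionError:
    print("Agent exceeded the step budget; check your routing.")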
If you want more confidence, wrap run_case() in pytest assertions and snapshot the output structure. That gives you repeatable regression tests every time you change prompts, tools, or routing logic.
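A minimal pytest version of that idea, assuming the graph-building code lives in a module named agent.py (the module name is illustrative):

# test_agent.py: run with `pytest test_agent.py`
from langchain_core.messages import HumanMessage
from agent import graph  # hypothetical module that builds and compiles the graph

def test_calculator_path():
    result = graph.invoke(
        {"messages": [HumanMessage(content="What is 12 * 11? Use the calculator.")]}
    )
    # The computed value should appear in the final assistant message
    assert "132" in result["messages"][-1].content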
Next Steps
- Add pytest tests around graph.invoke() outputs and message counts.
- Replace eval() with a safe math parser before using this pattern anywhere serious (a sketch follows below).
- Learn LangGraph checkpointing so you can test resumable agent state locally.
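As a starting point for the eval() replacement, here is a minimal sketch of an ast-based evaluator that only permits basic arithmetic; treat it as illustrative rather than audited:

import ast
import operator

# Whitelist of arithmetic operators; anything else raises
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_eval(expression: str) -> float:
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))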
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit