LangChain Tutorial (Python): testing agents locally for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to test a LangChain agent locally in Python without wiring it into a full app or deploying anything. You’ll build a small agent, run it against deterministic local tools, and verify behavior with repeatable tests before you ever connect real APIs.

What You'll Need

  • Python 3.10+
  • pip
  • A virtual environment
  • Packages:
    • langchain
    • langchain-openai
    • langchain-community
    • pytest
    • python-dotenv
  • An OpenAI API key if you want to test with a real chat model
  • Optional but useful:
    • pytest-cov
    • ruff

Install the dependencies:

pip install langchain langchain-openai langchain-community pytest python-dotenv

Step-by-Step

  1. Start by creating a tiny project layout and loading config from .env. For local agent testing, keep your secrets out of code and make the model choice configurable.
# app.py
import os
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")

if not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY is missing")
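
For reference, a matching .env might look like this (the key value is a placeholder; MODEL_NAME is optional and falls back to gpt-4o-mini):

# .env
OPENAI_API_KEY=sk-your-key-here
MODEL_NAME=gpt-4o-mini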
  2. Define a deterministic tool that your agent can call during tests. For local testing, tools should be boring and predictable so you can assert on exact outputs.
# app.py
from langchain_core.tools import tool

@tool
def add_numbers(a: int, b: int) -> str:
    """Add two integers and return the result as text."""
    return str(a + b)

@tool
def get_policy_status(policy_id: str) -> str:
    """Return a fake policy status for local testing."""
    if policy_id == "POL123":
        return "policy POL123 is active"
    return f"policy {policy_id} not found"
  3. Build the agent with a real LangChain chat model and bind the tools to it. This gives you production-shaped behavior while still letting you run everything from your laptop.
# app.py
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when needed."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

llm = ChatOpenAI(model=MODEL_NAME, temperature=0)
tools = [add_numbers, get_policy_status]

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
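
If you later want to swap in a stub model for tests, one option (a sketch, not part of the original code) is to wrap construction in a small factory such as a hypothetical build_executor and let tests inject their own llm:

# app.py (optional refactor sketch: build_executor is a hypothetical helper)
def build_executor(llm=None) -> AgentExecutor:
    """Build the agent executor, letting tests inject a different chat model."""
    llm = llm or ChatOpenAI(model=MODEL_NAME, temperature=0)
    agent = create_tool_calling_agent(llm, tools, prompt)
    return AgentExecutor(agent=agent, tools=tools, verbose=True)

# executor = build_executor()  # equivalent to the module-level wiring above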
  4. Add a small entry point so you can run manual checks locally before writing tests. Manual runs catch obvious prompt or tool wiring issues fast.
# app.py
def main() -> None:
    questions = [
        "What is 2 plus 3?",
        "Check policy POL123",
        "Check policy XYZ999",
    ]

    for question in questions:
        result = executor.invoke({"input": question})
        print("\nINPUT:", question)
        print("OUTPUT:", result["output"])

if __name__ == "__main__":
    main()
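
If a manual run looks wrong, it helps to see which tool the agent actually picked. AgentExecutor can return its intermediate steps; the helper below (debug_run is a made-up name for this sketch, not part of the original code) prints each tool call:

# Optional: a scratch check that prints which tool the agent actually called.
# return_intermediate_steps=True makes the executor include (action, observation) pairs.
def debug_run(question: str) -> None:
    debug_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        return_intermediate_steps=True,
    )
    result = debug_executor.invoke({"input": question})
    for action, observation in result["intermediate_steps"]:
        print("TOOL:", action.tool, "| ARGS:", action.tool_input, "| RESULT:", observation)

# e.g. debug_run("Check policy POL123")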
  5. Write tests that validate both tool behavior and agent behavior. The key here is to test the shape of the output and whether the agent routes correctly to your local tools.
# test_app.py
from app import add_numbers, get_policy_status

def test_add_numbers():
    assert add_numbers.invoke({"a": 2, "b": 3}) == "5"

def test_get_policy_status_found():
    assert get_policy_status.invoke({"policy_id": "POL123"}) == "policy POL123 is active"

def test_get_policy_status_missing():
    assert get_policy_status.invoke({"policy_id": "ABC"}) == "policy ABC not found"
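
If you also want an end-to-end check against the real model, keep it separate from the pure tool tests and skip it unless explicitly enabled. A sketch, where the RUN_AGENT_TESTS environment variable is an assumption of this example rather than anything LangChain provides:

# test_app.py (optional integration test: runs only when explicitly enabled)
import os
import pytest

RUN_AGENT_TESTS = os.getenv("RUN_AGENT_TESTS") == "1"

@pytest.mark.skipif(not RUN_AGENT_TESTS, reason="set RUN_AGENT_TESTS=1 to hit the real model")
def test_agent_routes_arithmetic_to_tool():
    from app import executor
    result = executor.invoke({"input": "What is 2 plus 3?"})
    assert "5" in result["output"]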
  6. Run the tests and then exercise the agent manually. If you want fully local execution without calling an external LLM during tests, keep those tests focused on tool functions and run integration checks separately.
pytest -q
python app.py

Testing It

First verify that the unit tests pass consistently; these should never depend on network calls or model randomness. Note that test_app.py imports app.py, and app.py checks OPENAI_API_KEY at import time, so set at least a placeholder value in .env even for the tool-only tests; no request is sent unless the agent actually runs. Then run python app.py and check that the agent uses add_numbers for arithmetic and get_policy_status for policy lookups.

If the model ignores tools or hallucinates answers, keep the temperature at 0, simplify the system prompt, and make sure your tool docstrings are specific. For intermediate-level testing, separate pure tool tests from agent integration checks so failures point to one layer instead of everything at once.

Next Steps

  • Add mocked LLM tests using LangChain’s message interfaces so you can test routing without hitting OpenAI.
  • Wrap this in pytest fixtures and snapshot assertions for more stable regression testing.
  • Extend the tool set with insurance-specific functions like quote lookup, claim status, or document retrieval.

By Cyprian Aarons, AI Consultant at Topiax.