LangChain Tutorial (Python): adding observability for advanced developers

By Cyprian Aarons. Updated 2026-04-21

This tutorial shows you how to add real observability to a LangChain Python app: tracing, metadata, tags, and structured run visibility through LangSmith. It matters once chains grow beyond a single prompt and you have to debug latency, tool calls, prompt drift, and bad outputs without guessing.

What You'll Need

  • Python 3.10+
  • langchain
  • langchain-openai
  • langsmith
  • An OpenAI API key
  • A LangSmith API key
  • A LangSmith project name
  • Basic familiarity with Runnable, ChatPromptTemplate, and StrOutputParser

Install the packages:

pip install langchain langchain-openai langsmith

Step-by-Step

  1. Start by setting your environment variables. LangChain will pick these up automatically and send traces to LangSmith when enabled.
import os

os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
os.environ["LANGSMITH_API_KEY"] = "your-langsmith-api-key"
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langchain-observability-tutorial"
  2. Build a small chain with explicit metadata and tags. This is where observability starts paying off: every run becomes searchable by service name, environment, or request type.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant for internal banking support."),
    ("human", "Explain {topic} in one paragraph.")
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = (
    prompt
    | llm
    | StrOutputParser()
).with_config(
    tags=["banking", "support", "observability"],
    metadata={"service": "knowledge-assistant", "env": "dev"}
)
  3. Invoke the chain with per-request context. Use config for request-level metadata so you can trace individual calls without changing the chain definition.
result = chain.invoke(
    {"topic": "AML transaction monitoring"},
    config={
        "tags": ["request:demo"],
        "metadata": {
            "user_id": "u_12345",
            "tenant_id": "tenant_acme",
            "trace_source": "tutorial"
        }
    }
)

print(result)
  4. Add a tool-style step so you can see multi-stage execution in traces. Advanced debugging usually means understanding where time is spent between model calls, transforms, and external lookups.
from langchain_core.runnables import RunnableLambda

def normalize_topic(inputs: dict) -> dict:
    topic = inputs["topic"].strip().lower()
    return {"topic": topic}

normalize_chain = RunnableLambda(normalize_topic)

observed_chain = (
    normalize_chain
    | prompt
    | llm
    | StrOutputParser()
).with_config(
    tags=["normalize", "llm-path"],
    metadata={"service": "knowledge-assistant", "env": "dev"}
)

print(observed_chain.invoke({"topic": "  Fraud Detection  "}))

  5. If you want deeper visibility on custom code paths, wrap them as runnables instead of hiding them in plain Python functions. That keeps the execution graph visible in LangSmith instead of flattening everything into one opaque step.
from langchain_core.runnables import RunnablePassthrough

def add_audit_fields(inputs: dict) -> dict:
    return {
        **inputs,
        "audit_context": f'{inputs.get("tenant_id")}:{inputs.get("user_id")}'
    }

audit_chain = (
    RunnablePassthrough.assign(audit=RunnableLambda(add_audit_fields))
    | RunnableLambda(lambda x: x["audit"])
)

print(
    audit_chain.invoke({
        "tenant_id": "tenant_acme",
        "user_id": "u_12345"
    })
)

Testing It

Run the script and make sure it prints model output without errors. Then open LangSmith and confirm a trace appears for each chain invocation, including the tags and metadata you set at both the chain and request level.

Check that the trace graph shows each runnable step instead of collapsing everything into one node. If you added custom functions as RunnableLambda, verify they appear as distinct spans with their own timing.

For production debugging, compare two runs that differ in metadata (such as tenant_id) or tags (such as request:demo). That gives you a clean way to filter failures by customer, environment, or workflow type.
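
If you prefer to check this from code instead of the UI, the LangSmith Python client can list runs by project and tag. The snippet below is a rough sketch, not the definitive API: the run-filter string and the Run fields used here (extra, start_time, end_time) should be verified against the langsmith version you have installed.

from langsmith import Client

client = Client()

# List root runs from the tutorial project that carry the request:demo tag.
runs = client.list_runs(
    project_name="langchain-observability-tutorial",
    filter='has(tags, "request:demo")',
    is_root=True,
)

for run in runs:
    metadata = (run.extra or {}).get("metadata", {})
    duration = (run.end_time - run.start_time).total_seconds() if run.end_time else None
    print(run.name, metadata.get("tenant_id"), duration)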

Next Steps

  • Add callback handlers for custom logging into your existing observability stack (a minimal sketch follows this list).
  • Learn how to trace agents and tools so tool selection errors are visible.
  • Store prompt versions in metadata so you can compare outputs across releases.
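
As a starting point for the first item above, here is a minimal sketch of a custom callback handler, assuming BaseCallbackHandler from langchain_core.callbacks. It only prints LLM latency; forwarding to your real logging or metrics backend is up to you. If your langchain version dispatches on_chat_model_start for chat models without falling back to on_llm_start, implement that hook as well.

import time
from langchain_core.callbacks import BaseCallbackHandler

class LatencyLoggingHandler(BaseCallbackHandler):
    """Record per-run LLM latency so it can be shipped to an existing observability stack."""

    def __init__(self):
        self._starts = {}

    def on_llm_start(self, serialized, prompts, *, run_id, **kwargs):
        self._starts[run_id] = time.monotonic()

    def on_llm_end(self, response, *, run_id, **kwargs):
        started = self._starts.pop(run_id, None)
        if started is not None:
            print(f"llm run {run_id} took {time.monotonic() - started:.2f}s")

# Attach the handler per request via config, alongside the tags and metadata from earlier.
result = chain.invoke(
    {"topic": "AML transaction monitoring"},
    config={"callbacks": [LatencyLoggingHandler()]},
)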

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

