LangChain Tutorial (Python): adding audit logs for advanced developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to add audit logging to a LangChain Python app so every model call, tool invocation, and final answer can be traced later. You need this when you’re building systems for regulated environments, supporting incident review, or simply maintaining a reliable paper trail of agent behavior.

What You'll Need

  • Python 3.10+
  • langchain
  • langchain-openai
  • openai API key
  • A writable log destination on disk
  • Basic familiarity with LangChain chains and prompts

Install the packages:

pip install langchain langchain-openai openai

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Start with a structured audit event model.
    Don’t dump raw strings into logs and call it done. Use a JSON-friendly schema so downstream systems can filter by session, event type, latency, and token usage.
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
import json
from pathlib import Path

AUDIT_FILE = Path("audit.log")

@dataclass
class AuditEvent:
    timestamp: str
    session_id: str
    event_type: str
    component: str
    payload: dict

def write_audit(event: AuditEvent) -> None:
    # default=str guards against non-JSON-serializable chain inputs/outputs.
    with AUDIT_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(event), ensure_ascii=False, default=str) + "\n")

def now_utc() -> str:
    return datetime.now(timezone.utc).isoformat()
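Before wiring this into LangChain, it's worth confirming the JSONL format round-trips cleanly. Here is a standalone sketch with inline copies of the definitions above and made-up event values, using a temporary file so it doesn't touch your real audit.log:

```python
# Standalone sketch: verify the JSONL audit format round-trips cleanly.
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

@dataclass
class AuditEvent:
    timestamp: str
    session_id: str
    event_type: str
    component: str
    payload: dict

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "audit.log"
    events = [
        AuditEvent("2026-04-21T12:00:00+00:00", "sess-1", "chain_start",
                   "chain", {"inputs": {"q": "hi"}}),
        AuditEvent("2026-04-21T12:00:01+00:00", "sess-1", "chain_end",
                   "chain", {"outputs": "ok"}),
    ]
    with log_path.open("a", encoding="utf-8") as f:
        for e in events:
            f.write(json.dumps(asdict(e), ensure_ascii=False) + "\n")

    # Read back: one JSON object per line, filterable by event_type.
    parsed = [json.loads(line)
              for line in log_path.read_text(encoding="utf-8").splitlines()]
    starts = [e for e in parsed if e["event_type"] == "chain_start"]
    print(len(parsed), len(starts))  # 2 1
```

The same filter-by-field pattern is what downstream systems (or a grep-and-jq pipeline) will run against the real log.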
  2. Build a callback handler that captures chain lifecycle events.
    LangChain callbacks are the right place to hook into execution without polluting business logic. This handler records prompt inputs, outputs, and errors in a single append-only log.
import uuid
from langchain_core.callbacks import BaseCallbackHandler

class AuditCallbackHandler(BaseCallbackHandler):
    def __init__(self, session_id: str):
        self.session_id = session_id

    def on_chain_start(self, serialized, inputs, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="chain_start",
            component=(serialized or {}).get("name", "unknown"),
            payload={"inputs": inputs},
        ))

    def on_chain_end(self, outputs, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="chain_end",
            component="chain",
            payload={"outputs": outputs},
        ))

    def on_chain_error(self, error, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="chain_error",
            component="chain",
            payload={"error": str(error)},
        ))
  3. Wire the callback into a real LangChain chat pipeline.
    This example uses ChatOpenAI, ChatPromptTemplate, and StrOutputParser. The important part is passing the callback through config, which keeps the audit layer separate from the chain definition.
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

session_id = str(uuid.uuid4())
audit_handler = AuditCallbackHandler(session_id=session_id)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("user", "Answer this in one sentence: {question}")
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
chain = prompt | llm | StrOutputParser()

result = chain.invoke(
    {"question": "Why do audit logs matter in regulated AI systems?"},
    config={"callbacks": [audit_handler]}
)

print(result)
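Note that the handler above writes prompt inputs and outputs to disk verbatim. If payloads may contain PII or secrets, redact before writing — the Next Steps section covers this, but a minimal stdlib sketch looks like the following (the patterns are illustrative only and should be tuned for your data):

```python
# Minimal PII-redaction sketch (illustrative patterns — not production-grade).
import re

# Hypothetical examples: email addresses and OpenAI-style secret keys.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "<email>"),
    (re.compile(r"sk-[A-Za-z0-9]{8,}"), "<api-key>"),
]

def redact(text: str) -> str:
    """Replace sensitive substrings before a payload is written to disk."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

print(redact("Contact alice@example.com with key sk-abc12345XYZ"))
# Contact <email> with key <api-key>
```

You would call `redact` on string payload fields inside `write_audit`, so every event passes through the same filter regardless of which handler produced it.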
  4. Add tool-level logging if your agent calls external functions.
    For advanced developers, chain-level logs are not enough. If the model can trigger tools, you want each tool input and output recorded separately so you can reconstruct decisions later.
from langchain_core.tools import tool

@tool
def lookup_policy(policy_id: str) -> str:
    write_audit(AuditEvent(
        timestamp=now_utc(),
        session_id=session_id,
        event_type="tool_start",
        component="lookup_policy",
        payload={"policy_id": policy_id},
    ))
    result = f"Policy {policy_id}: active"
    write_audit(AuditEvent(
        timestamp=now_utc(),
        session_id=session_id,
        event_type="tool_end",
        component="lookup_policy",
        payload={"result": result},
    ))
    return result

print(lookup_policy.invoke("POL-12345"))
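Hand-writing `write_audit` calls inside every tool gets repetitive. One option is a generic wrapper; here is a stdlib-only sketch where `record` stands in for `write_audit` and the names are illustrative:

```python
# Sketch: a generic audit wrapper so tool bodies stay free of logging calls.
import functools

def audited(record, session_id):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record({"event_type": "tool_start", "component": fn.__name__,
                    "session_id": session_id,
                    "payload": {"args": args, "kwargs": kwargs}})
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                record({"event_type": "tool_error", "component": fn.__name__,
                        "session_id": session_id,
                        "payload": {"error": str(exc)}})
                raise
            record({"event_type": "tool_end", "component": fn.__name__,
                    "session_id": session_id,
                    "payload": {"result": result}})
            return result
        return wrapper
    return decorator

# Demo with an in-memory sink instead of the JSONL file.
events = []

@audited(events.append, "sess-demo")
def lookup_policy(policy_id: str) -> str:
    return f"Policy {policy_id}: active"

print(lookup_policy("POL-12345"))
print([e["event_type"] for e in events])  # ['tool_start', 'tool_end']
```

When combining with LangChain's `@tool`, apply `audited` closest to the function so the wrapped callable is what the tool executes — that ordering is an assumption worth verifying against your LangChain version.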
  5. Log token usage and latency at the edge of your application.
    Audit trails are more useful when they include operational data. Even if you don’t store full prompts in production, you should still capture timing and usage metrics for each request.
import time

start = time.perf_counter()
response = chain.invoke(
    {"question": "What should we store in an AI audit log?"},
    config={"callbacks": [audit_handler]}
)
elapsed_ms = round((time.perf_counter() - start) * 1000, 2)

write_audit(AuditEvent(
    timestamp=now_utc(),
    session_id=session_id,
    event_type="request_metrics",
    component="app",
    payload={
        "latency_ms": elapsed_ms,
        "response_preview": response[:120],
    },
))

print(f"{response}\nLatency: {elapsed_ms} ms")
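Once `request_metrics` events accumulate, the log doubles as a cheap performance dataset. A small stdlib sketch of aggregating latency (the event values here are made up):

```python
# Sketch: summarize latency from request_metrics events (sample values).
import statistics

metrics = [
    {"event_type": "request_metrics", "payload": {"latency_ms": 420.5}},
    {"event_type": "request_metrics", "payload": {"latency_ms": 610.0}},
    {"event_type": "request_metrics", "payload": {"latency_ms": 385.2}},
]

latencies = sorted(m["payload"]["latency_ms"]
                   for m in metrics if m["event_type"] == "request_metrics")
print(f"median={statistics.median(latencies)} ms, max={latencies[-1]} ms")
```

In practice you'd load `metrics` from audit.log with a JSONL reader rather than a literal list.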

Testing It

Run the script once and check that audit.log contains one JSON object per line. You should see chain_start, chain_end, and any tool_* events in order for the same session_id.

If you want to validate failure handling, temporarily point the model to an invalid API key or force a bad input and confirm that chain_error is written. For production use, parse the file with a JSONL reader and verify that timestamps are UTC ISO-8601 strings.

A quick sanity check is to search by session_id and reconstruct the full request path from start to finish. If you can do that reliably, your audit trail is doing its job.
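The sanity check above can be sketched as a small JSONL reader. The sample lines are made up; in practice you would read audit.log. Sorting by timestamp works here because all timestamps are UTC ISO-8601 strings in the same format, so lexicographic order matches chronological order:

```python
# Sketch: reconstruct one session's request path from JSONL audit lines.
import json

sample_log = "\n".join([
    '{"timestamp": "2026-04-21T12:00:00+00:00", "session_id": "s1", "event_type": "chain_start", "component": "chain", "payload": {}}',
    '{"timestamp": "2026-04-21T12:00:01+00:00", "session_id": "s2", "event_type": "chain_start", "component": "chain", "payload": {}}',
    '{"timestamp": "2026-04-21T12:00:02+00:00", "session_id": "s1", "event_type": "chain_end", "component": "chain", "payload": {}}',
])

def session_trail(lines: str, session_id: str) -> list[dict]:
    """Return one session's events in chronological order."""
    events = [json.loads(l) for l in lines.splitlines() if l.strip()]
    trail = [e for e in events if e["session_id"] == session_id]
    return sorted(trail, key=lambda e: e["timestamp"])

print([e["event_type"] for e in session_trail(sample_log, "s1")])
# ['chain_start', 'chain_end']
```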

Next Steps

  • Add redaction for PII before writing prompts or tool arguments to disk.
  • Ship audit events to OpenTelemetry or a SIEM instead of local files.
  • Extend the schema with user IDs, tenant IDs, model name, and prompt versioning.


By Cyprian Aarons, AI Consultant at Topiax.
