# LangChain Tutorial (Python): Adding Audit Logs for Advanced Developers
This tutorial shows how to add audit logging to a LangChain Python app so that every model call, tool invocation, and final answer can be traced later. You need this when you're building for regulated environments, preparing for incident review, or when you simply want a reliable paper trail for agent behavior.
## What You'll Need
- Python 3.10+
- `langchain`
- `langchain-openai`
- `openai`
- An OpenAI API key
- A writable log destination on disk
- Basic familiarity with LangChain chains and prompts
Install the packages:
```bash
pip install langchain langchain-openai openai
```
Set your API key:
```bash
export OPENAI_API_KEY="your-key-here"
```
## Step-by-Step
### 1. Start with a structured audit event model

Don’t dump raw strings into logs and call it done. Use a JSON-friendly schema so downstream systems can filter by session, event type, latency, and token usage.
```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
import json
from pathlib import Path

AUDIT_FILE = Path("audit.log")


@dataclass
class AuditEvent:
    timestamp: str
    session_id: str
    event_type: str
    component: str
    payload: dict


def write_audit(event: AuditEvent) -> None:
    # One JSON object per line (JSONL); default=str keeps non-serializable
    # payload values (e.g. LangChain message objects) from crashing the logger.
    with AUDIT_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(event), ensure_ascii=False, default=str) + "\n")


def now_utc() -> str:
    return datetime.now(timezone.utc).isoformat()
```
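As a quick smoke test, you can emit one event by hand and check the line it produces. The session ID and payload here are placeholders, and the sample line is illustrative of the schema rather than exact output:

```python
# Hypothetical smoke test: session_id and payload are placeholders.
write_audit(AuditEvent(
    timestamp=now_utc(),
    session_id="demo-session",
    event_type="app_start",
    component="app",
    payload={"note": "audit log initialized"},
))
# audit.log now ends with one JSON object on a single line, shaped like:
# {"timestamp": "2025-01-01T00:00:00+00:00", "session_id": "demo-session",
#  "event_type": "app_start", "component": "app",
#  "payload": {"note": "audit log initialized"}}
```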
### 2. Build a callback handler that captures chain lifecycle events

LangChain callbacks are the right place to hook into execution without polluting business logic. This handler records prompt inputs, outputs, and errors in a single append-only log.
```python
import uuid

from langchain_core.callbacks import BaseCallbackHandler


class AuditCallbackHandler(BaseCallbackHandler):
    def __init__(self, session_id: str):
        self.session_id = session_id

    def on_chain_start(self, serialized, inputs, **kwargs):
        # serialized can be None for some runnables, so guard the lookup.
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="chain_start",
            component=(serialized or {}).get("name", "unknown"),
            payload={"inputs": inputs},
        ))

    def on_chain_end(self, outputs, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="chain_end",
            component="chain",
            payload={"outputs": outputs},
        ))

    def on_chain_error(self, error, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="chain_error",
            component="chain",
            payload={"error": str(error)},
        ))
```
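If you also want per-call token counts in the same log, the handler can implement `on_llm_end`. Here is a minimal sketch as a subclass (the class name is mine), assuming the provider populates `llm_output`; for `langchain-openai` it usually carries `token_usage` and `model_name`, but the field is provider-specific, so treat every key as optional:

```python
from langchain_core.outputs import LLMResult


class TokenAuditCallbackHandler(AuditCallbackHandler):
    """Hypothetical extension that also records provider-reported token usage."""

    def on_llm_end(self, response: LLMResult, **kwargs):
        # llm_output is provider-specific and may be None; guard all lookups.
        llm_output = response.llm_output or {}
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="llm_end",
            component=llm_output.get("model_name", "llm"),
            payload={"token_usage": llm_output.get("token_usage", {})},
        ))
```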
### 3. Wire the callback into a real LangChain chat pipeline

This example uses `ChatOpenAI`, `ChatPromptTemplate`, and `StrOutputParser`. The important part is passing the callback through `config`, which keeps the audit layer separate from the chain definition.
```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

session_id = str(uuid.uuid4())
audit_handler = AuditCallbackHandler(session_id=session_id)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("user", "Answer this in one sentence: {question}"),
])
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = prompt | llm | StrOutputParser()

result = chain.invoke(
    {"question": "Why do audit logs matter in regulated AI systems?"},
    config={"callbacks": [audit_handler]},
)
print(result)
```
### 4. Add tool-level logging if your agent calls external functions

For advanced developers, chain-level logs are not enough. If the model can trigger tools, you want each tool input and output recorded separately so you can reconstruct decisions later.
```python
from langchain_core.tools import tool


@tool
def lookup_policy(policy_id: str) -> str:
    """Look up the status of a policy by ID."""  # @tool requires a docstring
    write_audit(AuditEvent(
        timestamp=now_utc(),
        session_id=session_id,
        event_type="tool_start",
        component="lookup_policy",
        payload={"policy_id": policy_id},
    ))
    result = f"Policy {policy_id}: active"
    write_audit(AuditEvent(
        timestamp=now_utc(),
        session_id=session_id,
        event_type="tool_end",
        component="lookup_policy",
        payload={"result": result},
    ))
    return result


print(lookup_policy.invoke("POL-12345"))
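```

Hand-instrumenting every tool body works, but it gets repetitive. `BaseCallbackHandler` also exposes `on_tool_start` and `on_tool_end`, so the same audit handler can cover tools generically whenever it is attached to the run that executes them (for example, an agent run). A minimal sketch, again as a subclass with a name of my choosing:

```python
class ToolAuditCallbackHandler(AuditCallbackHandler):
    """Hypothetical extension that audits tools via callbacks instead of
    instrumenting each tool body."""

    def on_tool_start(self, serialized, input_str, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="tool_start",
            component=(serialized or {}).get("name", "tool"),
            payload={"input": input_str},
        ))

    def on_tool_end(self, output, **kwargs):
        write_audit(AuditEvent(
            timestamp=now_utc(),
            session_id=self.session_id,
            event_type="tool_end",
            component="tool",
            payload={"output": str(output)},
        ))
```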
### 5. Log token usage and latency at the edge of your application

Audit trails are more useful when they include operational data. Even if you don’t store full prompts in production, you should still capture timing and usage metrics for each request.
```python
import time

start = time.perf_counter()
response = chain.invoke(
    {"question": "What should we store in an AI audit log?"},
    config={"callbacks": [audit_handler]},
)
elapsed_ms = round((time.perf_counter() - start) * 1000, 2)

write_audit(AuditEvent(
    timestamp=now_utc(),
    session_id=session_id,
    event_type="request_metrics",
    component="app",
    payload={
        "latency_ms": elapsed_ms,
        "response_preview": response[:120],
    },
))
print(f"{response}\nLatency: {elapsed_ms} ms")
```
## Testing It
Run the script once and check that `audit.log` contains one JSON object per line. You should see `chain_start`, `chain_end`, and any `tool_*` events in order for the same `session_id`.

If you want to validate failure handling, temporarily point the model to an invalid API key or force a bad input and confirm that `chain_error` is written. For production use, parse the file with a JSONL reader and verify that timestamps are UTC ISO-8601 strings.

A quick sanity check is to search by `session_id` and reconstruct the full request path from start to finish. If you can do that reliably, your audit trail is doing its job.
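A minimal sketch of that verification pass, assuming the `audit.log` path and schema from the steps above:

```python
import json
from datetime import datetime
from pathlib import Path


def verify_audit_log(path: Path = Path("audit.log")) -> None:
    lines = path.read_text(encoding="utf-8").splitlines()
    events = [json.loads(line) for line in lines if line.strip()]
    by_session: dict[str, list[str]] = {}
    for event in events:
        # fromisoformat raises ValueError if the timestamp is malformed.
        datetime.fromisoformat(event["timestamp"])
        by_session.setdefault(event["session_id"], []).append(event["event_type"])
    # Reconstruct the request path per session: chain_start -> ... -> chain_end.
    for sid, event_types in by_session.items():
        print(sid, "->", " -> ".join(event_types))


verify_audit_log()
```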
## Next Steps
- Add redaction for PII before writing prompts or tool arguments to disk (see the sketch after this list).
- Ship audit events to OpenTelemetry or a SIEM instead of local files.
- Extend the schema with user IDs, tenant IDs, model name, and prompt versioning.
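For the first item, here is a minimal redaction sketch, assuming a single email pattern stands in for your real PII rules; run payloads through it before calling `write_audit`:

```python
import re

# Hypothetical pattern; a production system needs a vetted PII rule set.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(value):
    """Recursively mask email-like strings inside payload structures."""
    if isinstance(value, str):
        return EMAIL_RE.sub("[REDACTED_EMAIL]", value)
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


# Usage: build the payload with redact({"inputs": inputs}) before
# constructing the AuditEvent.
```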
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.