LangChain Tutorial (Python): Adding Audit Logs for Intermediate Developers
This tutorial shows you how to add audit logs to a LangChain Python app so you can record prompts, model outputs, tool calls, and errors in a way that is useful for compliance and debugging. If you are building internal copilots or regulated workflows, this gives you a practical pattern for tracking what the agent saw, decided, and returned.
What You'll Need
- Python 3.10+
- A LangChain setup with:
  - langchain
  - langchain-openai
- An OpenAI API key set as OPENAI_API_KEY
- Basic familiarity with:
  - ChatPromptTemplate
  - RunnableLambda
  - StrOutputParser
- A place to write logs:
  - a local file for development
  - centralized logging later if needed
Install the packages:
pip install langchain langchain-openai
Step-by-Step
- Start with a small chain that is easy to observe. The goal is not just to log the final answer, but to capture each intermediate step in a structured way.
import os

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Fail fast if the key is missing instead of silently continuing with an empty one.
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "Answer this in one sentence: {question}"),
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# prompt -> model -> plain string output
chain = prompt | llm | StrOutputParser()
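Before wiring in any logging, it helps to confirm the bare chain works. A quick sanity check, assuming your API key is set and you have network access:

# One-off smoke test of the chain itself.
print(chain.invoke({"question": "What is LangChain?"}))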
- Add a simple audit logger that writes JSON lines to disk. JSONL is easy to grep, ship to log tools, and ingest later for compliance review.
import json
from datetime import datetime, timezone

AUDIT_FILE = "audit.log"

def audit(event_type: str, payload: dict) -> None:
    """Append one structured audit event as a JSON line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        "payload": payload,
    }
    with open(AUDIT_FILE, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
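Each call to audit appends exactly one line. A record will look roughly like this (your timestamp will differ):

{"ts": "2024-01-01T12:00:00+00:00", "event_type": "chain_input", "payload": {"question": "What is an audit log?"}}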
- Wrap your chain call so you can log the input before execution and the output after execution. This is the point where most teams stop at logging final answers only; don't do that if you need traceability.
def run_with_audit(question: str) -> str:
    # Log what the chain is about to see...
    audit("chain_input", {"question": question})
    result = chain.invoke({"question": question})
    # ...and what it actually returned.
    audit("chain_output", {
        "question": question,
        "result": result,
    })
    return result

if __name__ == "__main__":
    answer = run_with_audit("What is an audit log?")
    print(answer)
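The wrapper above only records the happy path: if the invocation raises (a bad API key, a network timeout), nothing is written. A minimal sketch of an error-aware variant; the chain_error event name is my own convention, not a LangChain one:

def run_with_audit_safe(question: str) -> str:
    audit("chain_input", {"question": question})
    try:
        result = chain.invoke({"question": question})
    except Exception as exc:
        # Record the failure before re-raising so the audit trail stays complete.
        audit("chain_error", {"question": question, "error": repr(exc)})
        raise
    audit("chain_output", {"question": question, "result": result})
    return result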
- If you want intermediate visibility inside your own code path, split the work into steps and log each one. This is useful when your chain includes preprocessing, retrieval, policy checks, or post-processing.
from langchain_core.runnables import RunnableLambda

def normalize_question(inputs: dict) -> dict:
    question = inputs["question"].strip()
    audit("normalize_question", {"original": inputs["question"], "normalized": question})
    return {"question": question}

def postprocess_answer(text: str) -> str:
    cleaned = text.strip()
    audit("postprocess_answer", {"raw": text, "cleaned": cleaned})
    return cleaned

pipeline = (
    RunnableLambda(normalize_question)  # logs before the prompt is built
    | prompt
    | llm
    | StrOutputParser()
    | RunnableLambda(postprocess_answer)  # logs after the model responds
)

print(pipeline.invoke({"question": "  What does LangChain do?  "}))
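One run of this pipeline should leave a normalize_question event followed by a postprocess_answer event in the file, alongside any events from earlier runs. You can confirm the ordering by replaying the log:

# Replay the audit trail in write order.
with open(AUDIT_FILE, encoding="utf-8") as f:
    for line in f:
        print(json.loads(line)["event_type"])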
- For tool-based agents, log tool inputs and outputs explicitly. This matters because most production risk comes from tool use, not just from model text generation.
from langchain_core.tools import tool

@tool
def lookup_policy(policy_id: str) -> str:
    """Look up the status of a policy by its ID."""
    # Note: @tool requires a docstring; it doubles as the tool's description.
    # Log the input before doing any work, then log the result.
    audit("tool_call", {"tool": "lookup_policy", "input": policy_id})
    result = f"Policy {policy_id}: active"
    audit("tool_result", {"tool": "lookup_policy", "output": result})
    return result

print(lookup_policy.invoke("POL-12345"))
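Calling the tool directly is fine for a smoke test, but in an agent the model decides when to call it, and that decision is worth auditing too. A minimal sketch, assuming a langchain-openai version that supports bind_tools; the model_tool_call event name is my own convention:

llm_with_tools = llm.bind_tools([lookup_policy])
msg = llm_with_tools.invoke("Is policy POL-12345 active?")

# The model's requested tool calls live on the message, before anything executes.
for call in msg.tool_calls:
    audit("model_tool_call", {"tool": call["name"], "args": call["args"]})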
Testing It
Run the script once and check that audit.log contains one JSON object per line. You should see separate events for input capture, output capture, and any intermediate steps or tool calls you added.
A good test is to break something intentionally, such as passing an empty string or misconfiguring the API key, and confirm that your application logs the failure path too. In production systems, errors are part of the audit trail.
If you want stronger verification, parse the log file in a test and assert that required fields exist: timestamp, event type, and payload. That gives you a cheap regression check whenever someone changes the chain structure.
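A sketch of that regression check, written as a plain function you could drop into a pytest file; the file path and field names match the logger above:

def test_audit_records_have_required_fields():
    with open(AUDIT_FILE, encoding="utf-8") as f:
        records = [json.loads(line) for line in f]
    assert records, "expected at least one audit event"
    for record in records:
        # Every event must carry a timestamp, a type, and a payload.
        assert "ts" in record
        assert "event_type" in record
        assert "payload" in record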
Next Steps
- Add request IDs so every log line can be correlated across services.
- Move from file logging to structured logging with the standard logging module or OpenTelemetry.
- Use LangChain callbacks if you want deeper tracing across model calls, retrievers, and tools without hand-wrapping every function; see the sketch after this list.
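A minimal callback-based auditor, reusing the audit helper from earlier. The hook names (on_chat_model_start, on_llm_end, on_tool_start, on_tool_end) are LangChain's; the event names written to the log are my own convention, and exactly which hooks fire depends on your chain:

from langchain_core.callbacks import BaseCallbackHandler

class AuditCallbackHandler(BaseCallbackHandler):
    """Writes audit events for model and tool activity."""

    def on_chat_model_start(self, serialized, messages, **kwargs):
        # messages is a list of message batches; log their text content.
        audit("llm_start", {"messages": [[m.content for m in batch] for batch in messages]})

    def on_llm_end(self, response, **kwargs):
        audit("llm_end", {"generations": [g.text for gen in response.generations for g in gen]})

    def on_tool_start(self, serialized, input_str, **kwargs):
        audit("tool_start", {"input": input_str})

    def on_tool_end(self, output, **kwargs):
        audit("tool_end", {"output": str(output)})

# Attach per invocation; no hand-wrapping of each step needed.
chain.invoke({"question": "What is an audit log?"},
             config={"callbacks": [AuditCallbackHandler()]})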
Keep Learning
- The complete AI Agents Roadmap (my full 8-step breakdown)
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me (I build AI for banks and insurance companies)
By Cyprian Aarons, AI Consultant at Topiax.