CrewAI Tutorial (Python): adding audit logs for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to add audit logging to a CrewAI Python app so every task, tool call, and agent decision leaves a trace. You need this when your agents handle regulated workflows and you must be able to answer basic questions: who did what, when, and with which input.

What You'll Need

  • Python 3.10 or newer
  • crewai
  • openai
  • An OpenAI API key in OPENAI_API_KEY
  • A basic CrewAI project with at least one agent and one task
  • Optional but useful:
    • python-dotenv for local .env loading
    • A log destination such as stdout, file, or SQLite

Step-by-Step

  1. Start by installing the packages and setting up your environment. For this example, I’m using standard logging plus CrewAI’s normal agent/task setup, so there’s no custom framework dependency.
pip install crewai openai python-dotenv
export OPENAI_API_KEY="your-key-here"
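If you prefer the python-dotenv route from the prerequisites, you can keep the key in a local .env file instead of exporting it in every shell. A minimal sketch, assuming a .env file next to your script that contains OPENAI_API_KEY=...:
import os

from dotenv import load_dotenv

load_dotenv()  # reads OPENAI_API_KEY (and anything else) from ./.env

if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it")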
  2. Create a small audit logger that writes structured JSON lines. This keeps logs machine-readable and makes it easy to ship them into ELK, Datadog, or a SIEM later.
import json
import logging
import sys
from datetime import datetime, timezone

logger = logging.getLogger("audit")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)  # send audit lines to stdout so they can be piped to a file
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)

def audit_event(event_type: str, **payload):
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        **payload,
    }
    logger.info(json.dumps(record))
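For a quick sanity check, call the helper once and confirm it prints a single JSON line to stdout. The values below are illustrative; your timestamp will differ:
audit_event("example_event", user="analyst-1", action="demo")
# Prints one line like:
# {"ts": "2026-04-21T10:15:03.412312+00:00", "event_type": "example_event", "user": "analyst-1", "action": "demo"}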
  3. Build your CrewAI objects and wrap the execution points with audit events. The clean pattern here is to log before kickoff, after kickoff, and around any tool-like function you control.
from crewai import Agent, Task, Crew, Process, LLM

llm = LLM(model="gpt-4o-mini")

researcher = Agent(
    role="Researcher",
    goal="Summarize customer policy changes accurately",
    backstory="You work in a regulated insurance operations team.",
    llm=llm,
    verbose=True,
)

task = Task(
    description="Summarize the latest policy change memo in 3 bullet points.",
    expected_output="A concise summary with no fabrication.",
    agent=researcher,
)

crew = Crew(
    agents=[researcher],
    tasks=[task],
    process=Process.sequential,
    verbose=True,
)
  4. Add an execution wrapper that records the run metadata and final output. This is the part most teams skip; without it you have logs from the model but no durable record of the business action that was requested.
import uuid

def run_with_audit():
    run_id = str(uuid.uuid4())
    audit_event(
        "crew_kickoff_started",
        run_id=run_id,
        crew_name="policy_summary_crew",
        task_count=len(crew.tasks),
    )

    result = crew.kickoff()

    audit_event(
        "crew_kickoff_completed",
        run_id=run_id,
        crew_name="policy_summary_crew",
        result=str(result),
    )
    return result

if __name__ == "__main__":
    output = run_with_audit()
    print(output)
  5. If you have tools or helper functions, instrument them too. In production systems, the highest-value audit trail is usually around external side effects: database reads, policy lookups, document generation, or API calls.
def lookup_policy(policy_id: str) -> str:
    audit_event("tool_call_started", tool_name="lookup_policy", policy_id=policy_id)

    # Replace this with a real DB/API call.
    result = f"Policy {policy_id}: active coverage with renewal pending."

    audit_event(
        "tool_call_completed",
        tool_name="lookup_policy",
        policy_id=policy_id,
        result=result,
    )
    return result
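If you would rather let the agent decide when to call the helper, you can also register it as a CrewAI tool. Treat this as a sketch rather than the setup used in this tutorial: depending on your CrewAI version the decorator lives in crewai.tools or in the separate crewai_tools package, and the tool name string is arbitrary.
from crewai.tools import tool  # older versions: from crewai_tools import tool

@tool("Lookup Policy")
def lookup_policy_tool(policy_id: str) -> str:
    """Look up the current status of a policy by its ID."""
    # Reuse the audited helper so every agent-initiated call is logged too.
    return lookup_policy(policy_id)

# Attach it to the agent so the LLM can invoke it:
# researcher = Agent(role="Researcher", ..., tools=[lookup_policy_tool])
The next step sticks with the simpler pattern of injecting the helper's output directly into the task description.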
  6. Connect the tool into your agent workflow by using its output in the task context. If you need stronger traceability later, pass a run_id through every helper so all events can be correlated end-to-end; see the sketch after this step's code.
policy_text = lookup_policy("POL-1042")

task_with_context = Task(
    description=f"Summarize this policy status for an operations analyst: {policy_text}",
    expected_output="A short operational summary.",
    agent=researcher,
)

crew_with_context = Crew(
    agents=[researcher],
    tasks=[task_with_context],
    process=Process.sequential,
)

if __name__ == "__main__":
    audit_event("second_run_started", crew_name="policy_summary_crew")
    print(crew_with_context.kickoff())
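For that end-to-end correlation, one option is to give lookup_policy an optional run_id parameter and stamp the same ID onto every event in a run. This is a sketch under that assumption; the run_correlated name and the run_started/run_completed event names are illustrative, not part of the code above:
import uuid

def lookup_policy(policy_id: str, run_id: str | None = None) -> str:
    audit_event("tool_call_started", run_id=run_id, tool_name="lookup_policy", policy_id=policy_id)

    # Replace this with a real DB/API call.
    result = f"Policy {policy_id}: active coverage with renewal pending."

    audit_event("tool_call_completed", run_id=run_id, tool_name="lookup_policy", policy_id=policy_id, result=result)
    return result

def run_correlated(policy_id: str) -> str:
    run_id = str(uuid.uuid4())
    audit_event("run_started", run_id=run_id, policy_id=policy_id)

    policy_text = lookup_policy(policy_id, run_id=run_id)

    task = Task(
        description=f"Summarize this policy status for an operations analyst: {policy_text}",
        expected_output="A short operational summary.",
        agent=researcher,
    )
    crew = Crew(agents=[researcher], tasks=[task], process=Process.sequential)

    result = crew.kickoff()
    audit_event("run_completed", run_id=run_id, result=str(result))
    return str(result)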

Testing It

Run the script locally and confirm that every major action prints a JSON log line before and after execution. You should see at least one crew_kickoff_started, one tool_call_started, one tool_call_completed, and one crew_kickoff_completed event.

Then inspect whether the timestamps are UTC ISO-8601 strings and whether each event includes enough context to correlate actions later. If you want better operational visibility, pipe stdout into a file and verify that each line is valid JSON.
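Because verbose=True makes CrewAI print its own progress output to stdout as well, the validator below only checks lines that look like JSON objects. A minimal sketch, assuming you redirected the script's output to audit.log (the filename is arbitrary):
import json

checked = 0
with open("audit.log", "r", encoding="utf-8") as fh:
    for lineno, line in enumerate(fh, start=1):
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and CrewAI's verbose console output
        event = json.loads(line)  # raises JSONDecodeError if the line is malformed
        assert "ts" in event and "event_type" in event, f"line {lineno} is missing required fields"
        checked += 1

print(f"Validated {checked} audit events.")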

Finally, intentionally break something like the API key or a prompt input and confirm the failure still leaves an audit trail up to the point of error. That’s the difference between “we logged something” and “we can actually investigate incidents.”
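One way to guarantee that trail is to wrap the kickoff in a try/except that records the error before re-raising. This builds on run_with_audit above; the crew_kickoff_failed event name is our own convention, not something CrewAI emits for you:
def run_with_audit_safe():
    run_id = str(uuid.uuid4())
    audit_event("crew_kickoff_started", run_id=run_id, crew_name="policy_summary_crew")
    try:
        result = crew.kickoff()
    except Exception as exc:
        # Capture enough context to investigate the incident, then let the error propagate.
        audit_event(
            "crew_kickoff_failed",
            run_id=run_id,
            crew_name="policy_summary_crew",
            error_type=type(exc).__name__,
            error=str(exc),
        )
        raise
    audit_event("crew_kickoff_completed", run_id=run_id, crew_name="policy_summary_crew", result=str(result))
    return result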

Next Steps

  • Add correlation IDs to every request coming from your web app or job runner.
  • Send these JSON logs to OpenSearch, Datadog, or CloudWatch instead of stdout.
  • Extend the pattern to log prompt hashes and redacted inputs for compliance reviews.
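For the last bullet, one common pattern is to log a SHA-256 hash of the exact prompt plus a redacted preview, so a reviewer can later prove which input was used without the log storing sensitive text verbatim. A minimal sketch; the redaction rules and field names are illustrative, not a compliance standard:
import hashlib
import re

def audit_prompt(run_id: str, prompt: str) -> None:
    # Hash the exact prompt so it can be matched against stored inputs later.
    prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    # Crude redaction: mask email addresses and long digit runs before logging a preview.
    redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", prompt)
    redacted = re.sub(r"\d{6,}", "[NUMBER]", redacted)

    audit_event(
        "prompt_recorded",
        run_id=run_id,
        prompt_sha256=prompt_hash,
        prompt_preview=redacted[:200],
    )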

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

