AutoGen Tutorial (Python): adding audit logs for advanced developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to add audit logging to an AutoGen Python agent workflow so every message, tool call, and final response is traceable. This matters when you're building systems for regulated environments like banking or insurance, where reviewers expect evidence of who said what, when, and why.

What You'll Need

  • Python 3.10+
  • pyautogen installed
  • An OpenAI API key
  • Basic familiarity with AutoGen agents and GroupChat
  • A writable local directory for log files
  • Optional: a JSON viewer or log aggregator if you want to inspect logs at scale

Step-by-Step

  1. Start by installing the package and setting your API key. Keep the key in an environment variable; don’t hardcode it into your script.
pip install pyautogen
export OPENAI_API_KEY="your-api-key"
  2. Create a logger that writes structured audit events to disk. Use JSON lines so each event is one record, which makes downstream ingestion into SIEMs or ELK stacks straightforward.
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("autogen_audit")
audit_logger.setLevel(logging.INFO)

handler = logging.FileHandler("audit.log")
handler.setFormatter(logging.Formatter("%(message)s"))
audit_logger.addHandler(handler)

def audit_event(event_type: str, **data):
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        **data,
    }
    audit_logger.info(json.dumps(record, ensure_ascii=False))
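Before wiring up agents, you can confirm the record shape round-trips cleanly. This standalone snippet mirrors what audit_event writes, minus the file handler:

```python
import json
from datetime import datetime, timezone

# Mirrors the record audit_event builds: UTC timestamp, event type, payload.
def make_record(event_type: str, **data):
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        **data,
    }

# One event per line; each line must parse back as a single JSON object.
line = json.dumps(make_record("tool_call_start", tool_name="lookup_policy"), ensure_ascii=False)
parsed = json.loads(line)
```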
  3. Build a custom agent that emits audit records whenever it sends or receives messages. This is the cleanest place to capture message-level traceability without changing your business logic everywhere else.
import autogen

class AuditedAssistantAgent(autogen.AssistantAgent):
    def send(self, message, recipient, request_reply=None, silent=False):
        audit_event(
            "agent_send",
            sender=self.name,
            recipient=getattr(recipient, "name", str(recipient)),
            message=message if isinstance(message, str) else str(message),
        )
        return super().send(message, recipient, request_reply=request_reply, silent=silent)

    def receive(self, message, sender, request_reply=None, silent=False):
        audit_event(
            "agent_receive",
            recipient=self.name,
            sender=getattr(sender, "name", str(sender)),
            message=message if isinstance(message, str) else str(message),
        )
        return super().receive(message, sender, request_reply=request_reply, silent=silent)
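If you want to verify the override pattern itself without installing pyautogen, the same idea can be sketched framework-free. The Agent class below is a stand-in for illustration, not AutoGen's:

```python
# Framework-free sketch: a subclass records an audit event before delegating
# to the parent's send, exactly the shape used in AuditedAssistantAgent above.
events = []

class Agent:
    def __init__(self, name):
        self.name = name

    def send(self, message, recipient):
        return f"{self.name} -> {recipient.name}: {message}"

class AuditedAgent(Agent):
    def send(self, message, recipient):
        events.append({
            "event_type": "agent_send",
            "sender": self.name,
            "recipient": recipient.name,
            "message": message,
        })
        return super().send(message, recipient)

a, b = AuditedAgent("assistant"), Agent("user")
result = a.send("hello", b)
```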
  4. Add tool-call auditing by wrapping your function with a logger before registering it. In production this is where you capture external side effects like policy lookups or claims database reads.
import functools

def audited_tool(fn):
    @functools.wraps(fn)  # preserve __name__ and signature so tool registration still works
    def wrapper(*args, **kwargs):
        audit_event("tool_call_start", tool_name=fn.__name__, args=str(args), kwargs=str(kwargs))
        result = fn(*args, **kwargs)
        audit_event("tool_call_end", tool_name=fn.__name__, result=str(result))
        return result
    return wrapper

@audited_tool
def lookup_policy(policy_id: str) -> dict:
    return {"policy_id": policy_id, "status": "active", "tier": "gold"}
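To see the wrapper's event ordering without touching the file-based logger, here is an in-memory variant of the same decorator:

```python
# In-memory variant: events land in a list instead of audit.log,
# which makes the start/end ordering easy to inspect.
events = []

def audited_tool(fn):
    def wrapper(*args, **kwargs):
        events.append(("tool_call_start", fn.__name__))
        result = fn(*args, **kwargs)
        events.append(("tool_call_end", fn.__name__))
        return result
    return wrapper

@audited_tool
def lookup_policy(policy_id):
    return {"policy_id": policy_id, "status": "active"}

out = lookup_policy("P12345")
```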
  5. Wire everything into a minimal AutoGen chat and run it. This example uses a single assistant plus user proxy so you can verify the log trail end-to-end before expanding to multi-agent workflows.
import os

from autogen import UserProxyAgent

llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
}

assistant = AuditedAssistantAgent(
    name="audited_assistant",
    llm_config=llm_config,
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
)

# Register the audited tool so the assistant can propose it and the proxy executes it.
assistant.register_for_llm(description="Look up an insurance policy by ID.")(lookup_policy)
user_proxy.register_for_execution()(lookup_policy)

audit_event("session_start", session_id="demo-001")

user_proxy.initiate_chat(
    assistant,
    message="Call lookup_policy('P12345') and summarize the status.",
)
  6. If you’re using GroupChat, log the orchestration layer too. That gives you visibility into speaker selection and multi-agent handoffs, which matters when a compliance reviewer asks why one agent took over from another.
from autogen import GroupChat, GroupChatManager

groupchat = GroupChat(
    agents=[assistant, user_proxy],
    messages=[],
    max_round=3,
)

manager = GroupChatManager(
    groupchat=groupchat,
    llm_config=llm_config,
)

audit_event("groupchat_start", agents=[a.name for a in groupchat.agents])

user_proxy.initiate_chat(manager, message="Use the policy tool and report the result.")

Testing It

Run the script once and inspect audit.log. You should see JSON records for session start, agent sends and receives, tool invocation start/end, and any group chat events you added.

If you don’t see tool events, confirm the function is actually being called by the model or invoked directly in your test flow. If you don’t see agent events, make sure your custom class is being used instead of the stock AssistantAgent.

For a quick validation pass:

  • Check that timestamps are UTC ISO-8601 strings.
  • Verify each record has a stable event_type.
  • Confirm sensitive data is either masked or excluded before writing logs.
  • Load the file with jq or Python’s json module to ensure every line parses cleanly.
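That validation pass can be scripted. A minimal checker, shown here running over inline sample lines rather than audit.log so it is self-contained:

```python
import json

def validate_lines(lines):
    """Return (line_no, error) pairs for lines failing basic audit checks."""
    problems = []
    for i, line in enumerate(lines, 1):
        try:
            rec = json.loads(line)
        except json.JSONDecodeError as e:
            problems.append((i, f"unparseable: {e}"))
            continue
        if "event_type" not in rec:
            problems.append((i, "missing event_type"))
        # UTC ISO-8601 timestamps end in "+00:00" or "Z".
        if not rec.get("ts", "").endswith(("+00:00", "Z")):
            problems.append((i, "timestamp not UTC"))
    return problems

sample = [
    '{"ts": "2026-04-21T10:00:00+00:00", "event_type": "session_start"}',
    '{"ts": "2026-04-21T10:00:01+00:00"}',
]
issues = validate_lines(sample)
```

To run it against your real log, pass `open("audit.log")` in place of the sample list.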

Next Steps

  • Add redaction for PII fields before writing audit records.
  • Send the same JSON events to OpenTelemetry or your SIEM instead of only local disk.
  • Extend this pattern to RetrieveAssistantAgent and multi-tool workflows so retrieval traces are also auditable.
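As a starting point for the PII redaction item, here is a sketch with assumed patterns. The regexes and placeholders are illustrative, not a complete PII inventory; adapt them to your own data:

```python
import re

# Hypothetical redaction rules: mask email addresses and policy IDs before
# a record is written. Extend this list for your own PII fields.
PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<EMAIL>"),
    (re.compile(r"\bP\d{5,}\b"), "<POLICY_ID>"),
]

def redact(value: str) -> str:
    for pattern, placeholder in PATTERNS:
        value = pattern.sub(placeholder, value)
    return value

def redact_record(record: dict) -> dict:
    """Apply redaction to every string field of an audit record."""
    return {k: redact(v) if isinstance(v, str) else v for k, v in record.items()}

clean = redact_record({
    "event_type": "tool_call_start",
    "message": "Lookup P12345 for bob@example.com",
})
```

Call redact_record inside audit_event, just before json.dumps, so nothing unmasked reaches disk.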

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
