How to Build a Compliance-Checking Agent Using AutoGen in Python for Pension Funds

By Cyprian Aarons. Updated 2026-04-21

Tags: compliance-checking, autogen, python, pension-funds

A compliance checking agent for pension funds reviews contributions, withdrawals, member communications, investment actions, and policy changes against regulatory and internal rules before anything is approved. It matters because pension operations have low tolerance for errors: a bad rule interpretation can trigger member harm, regulatory findings, and expensive remediation.

Architecture

• User-facing request layer
  - Accepts compliance checks from ops teams, case managers, or workflow systems.
  - Normalizes inputs like transaction type, jurisdiction, member status, and effective dates.
• Policy retrieval layer
  - Pulls the relevant pension scheme rules, trustee policies, and jurisdiction-specific regulations.
  - Keeps the agent grounded in current documents instead of relying on model memory.
• AutoGen agent group
  - Uses a UserProxyAgent to orchestrate execution.
  - Uses one or more AssistantAgent instances for policy analysis, exception handling, and final decisioning.
  - Optionally adds a reviewer agent for second-pass validation on high-risk cases.
• Decision engine
  - Converts the model’s reasoning into structured outputs: compliant, needs_review, or non_compliant.
  - Forces citations back to source policy text for auditability.
• Audit and evidence store
  - Persists prompts, retrieved policy snippets, model outputs, timestamps, and operator overrides.
  - Required for pension fund audit trails and regulatory review.
• Guardrail layer
  - Blocks unsupported actions like direct approval of benefit payments without human sign-off.
  - Enforces data residency and PII handling rules before any external calls.
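As a rough illustration of the policy retrieval layer, the sketch below filters an in-memory clause store by jurisdiction, effective date, and keyword, then renders the hits so the agent can cite them. The PolicyClause type, the clause IDs, the "ZA" jurisdiction code, and the keyword matching are all assumptions made for this example; a real deployment would more likely sit on a document store or search index.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyClause:
    clause_id: str        # hypothetical identifier used for citations
    jurisdiction: str     # hypothetical jurisdiction code
    effective_from: date
    text: str


def retrieve_clauses(store, jurisdiction, as_of, keywords):
    """Return clauses in force on `as_of` for the jurisdiction that mention any keyword."""
    wanted = [k.lower() for k in keywords]
    hits = [
        c for c in store
        if c.jurisdiction == jurisdiction
        and c.effective_from <= as_of
        and any(k in c.text.lower() for k in wanted)
    ]
    # Newest first, so more recent clauses are listed before older ones.
    return sorted(hits, key=lambda c: c.effective_from, reverse=True)


def render_excerpts(clauses):
    """Format clauses so the agent can cite them by clause_id."""
    return "\n".join(f"[{c.clause_id}] {c.text}" for c in clauses)


# Illustrative data only; the clause IDs and dates are made up.
STORE = [
    PolicyClause("SCHEME-4.2", "ZA", date(2024, 3, 1),
                 "Benefits may only be paid out on retirement, death, "
                 "disability, or other approved exit events."),
    PolicyClause("SCHEME-7.1", "ZA", date(2027, 1, 1),
                 "Draft rule on benefits, not yet in force."),
    PolicyClause("UK-2.3", "UK", date(2023, 1, 1),
                 "Benefits transfer rules for UK members."),
]

clauses = retrieve_clauses(STORE, "ZA", date(2026, 4, 21), ["benefits"])
print(render_excerpts(clauses))
```

Filtering on the effective date before the model sees anything is one simple way to keep not-yet-in-force or out-of-jurisdiction text out of the prompt.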

Implementation

1. Install AutoGen and define the compliance task

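Install the package first. The examples here assume the classic 0.2-style API exposed by the autogen import, which is typically installed with pip install pyautogen; newer AutoGen releases use a different API. For this pattern, use AutoGen’s agent chat primitives directly. The example below creates a small multi-agent setup that checks whether a pension fund action violates policy.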

import os

from autogen import AssistantAgent, UserProxyAgent

# Read the API key from the environment instead of hardcoding it in source.
llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
    "temperature": 0,  # keep outputs as repeatable as possible for audits
}

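user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    # With 0, the proxy stops after the assistant's first reply instead of
    # sending an empty auto-reply that would trigger a second model call.
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)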

policy_analyst = AssistantAgent(
    name="policy_analyst",
    llm_config=llm_config,
    system_message=(
        "You are a pension fund compliance analyst. "
        "Assess requests against provided policy text only. "
        "Return concise findings with citations."
    ),
)

reviewer = AssistantAgent(
    name="reviewer",
    llm_config=llm_config,
    system_message=(
        "You are a senior compliance reviewer. "
        "Check the first analyst's conclusion for missed risks. "
        "Return only corrections or confirm the result."
    ),
)

request = """
Check this action:
- Action: transfer member benefits to external bank account
- Jurisdiction: South Africa
- Member age: 58
- Retirement status: not retired
- Policy excerpt: Benefits may only be paid out on retirement, death, disability, or other approved exit events.
"""

chat_result = user_proxy.initiate_chat(
    policy_analyst,
    message=request,
)
print(chat_result.summary)

The important part here is that AssistantAgent does the reasoning and UserProxyAgent controls the flow. In production you usually wrap this in your own service layer so the agent never sees raw requests without validation.
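A minimal sketch of such a service layer might look like the following. It validates a raw payload and only then renders the message text that would be passed to initiate_chat. The field names, the SUPPORTED_ACTIONS set, and the InvalidRequest exception are assumptions for illustration, not part of AutoGen.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical allow-list of transaction types this service understands.
SUPPORTED_ACTIONS = {"benefit_transfer", "early_withdrawal", "lump_sum_payment"}
REQUIRED_FIELDS = ("action", "jurisdiction", "member_age",
                   "retirement_status", "effective_date")


class InvalidRequest(ValueError):
    """Raised before any agent call when the payload fails validation."""


@dataclass(frozen=True)
class ComplianceRequest:
    action: str
    jurisdiction: str
    member_age: int
    retirement_status: str
    effective_date: date


def parse_request(raw: dict) -> ComplianceRequest:
    missing = [k for k in REQUIRED_FIELDS if raw.get(k) in (None, "")]
    if missing:
        raise InvalidRequest(f"missing fields: {', '.join(missing)}")
    if raw["action"] not in SUPPORTED_ACTIONS:
        raise InvalidRequest(f"unsupported action: {raw['action']}")
    try:
        age = int(raw["member_age"])
        effective = date.fromisoformat(raw["effective_date"])
    except (TypeError, ValueError) as exc:
        raise InvalidRequest("member_age or effective_date is malformed") from exc
    return ComplianceRequest(raw["action"], raw["jurisdiction"], age,
                             raw["retirement_status"], effective)


def render_message(req: ComplianceRequest, policy_excerpt: str) -> str:
    """Build the chat message from validated fields only."""
    return (
        "Check this action:\n"
        f"- Action: {req.action}\n"
        f"- Jurisdiction: {req.jurisdiction}\n"
        f"- Member age: {req.member_age}\n"
        f"- Retirement status: {req.retirement_status}\n"
        f"- Effective date: {req.effective_date.isoformat()}\n"
        f"- Policy excerpt: {policy_excerpt}\n"
    )


req = parse_request({
    "action": "benefit_transfer",
    "jurisdiction": "South Africa",
    "member_age": 58,
    "retirement_status": "not retired",
    "effective_date": "2026-04-21",
})
print(render_message(req, "Benefits may only be paid out on retirement, "
                          "death, disability, or other approved exit events."))
```

Rejecting bad payloads here means a malformed case never costs a model call and never produces a decision that looks valid.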

2. Add structured output so downstream systems can act on it

Free-form prose is hard to audit. Force the assistant to emit JSON-like fields that your workflow engine can parse deterministically.

from autogen import AssistantAgent, UserProxyAgent

decision_agent = AssistantAgent(
    name="decision_agent",
    llm_config=llm_config,
    system_message=(
        "You are a pension fund compliance decision engine. "
        "Output only a JSON object in exactly this format:\n"
        "{"
        '"status": "compliant|needs_review|non_compliant", '
        '"reason": "...", '
        '"citations": ["..."], '
        '"risk_level": "low|medium|high"'
        "}\n"
        "Use only the supplied policy text."
    ),
)

case_text = """
Request: approve early withdrawal due to financial hardship.
Policy: Early withdrawals are prohibited unless explicitly permitted by scheme rules and local regulation.
"""

result = user_proxy.initiate_chat(decision_agent, message=case_text)
print(result.summary)

In practice you should validate the returned structure with Pydantic or JSON schema before saving it. If the model returns malformed output, mark the case needs_review rather than trying to repair it silently.
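The validation step could be sketched as follows, using only the standard library as a stand-in for Pydantic or JSON Schema. The parse_decision and fallback helpers are hypothetical names, and defaulting the fallback risk_level to "high" is an assumption made so that malformed cases are treated conservatively.

```python
import json

ALLOWED_STATUS = {"compliant", "needs_review", "non_compliant"}
ALLOWED_RISK = {"low", "medium", "high"}


def fallback(reason: str) -> dict:
    """Conservative default: never repair silently, always send to review."""
    return {"status": "needs_review", "reason": reason,
            "citations": [], "risk_level": "high"}


def parse_decision(raw_text) -> dict:
    """Validate the model's output; return a needs_review record if it is malformed."""
    try:
        data = json.loads(raw_text)
    except (TypeError, ValueError):
        return fallback("model output was not valid JSON")
    if not isinstance(data, dict):
        return fallback("model output was not a JSON object")

    status, risk = data.get("status"), data.get("risk_level")
    if not isinstance(status, str) or status not in ALLOWED_STATUS:
        return fallback("status missing or outside the allowed values")
    if not isinstance(risk, str) or risk not in ALLOWED_RISK:
        return fallback("risk_level missing or outside the allowed values")

    reason, citations = data.get("reason"), data.get("citations")
    if not isinstance(reason, str) or not reason.strip():
        return fallback("reason missing or empty")
    if (not isinstance(citations, list) or not citations
            or not all(isinstance(c, str) and c.strip() for c in citations)):
        return fallback("citations missing or empty")

    return {"status": status, "reason": reason,
            "citations": citations, "risk_level": risk}


good = parse_decision(
    '{"status": "non_compliant", "reason": "Hardship is not a permitted event.", '
    '"citations": ["Early withdrawals are prohibited unless explicitly permitted"], '
    '"risk_level": "high"}'
)
bad = parse_decision("The request looks fine to me.")
print(good["status"], bad["status"])
```

Keeping the original raw text alongside the parsed record is worthwhile, since reviewers handling a needs_review case will want to see what the model actually produced.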

3. Chain a second-pass reviewer for high-risk cases

Pension funds need stronger controls around benefit payments, transfers, contribution holidays, and exceptions. A second agent helps catch missed constraints before an answer reaches operations.

analysis_prompt = """
Case:
- Action: approve lump-sum benefit payment
- Member status: active
- Age: 54
- Policy excerpt: Lump-sum benefits require retirement eligibility and trustee approval where applicable.

First pass conclusion:
The request appears non-compliant because the member is not retired.
"""

first_pass = user_proxy.initiate_chat(policy_analyst, message=analysis_prompt)

review_prompt = f"""
Review this compliance assessment for accuracy and missing risks.
Original case:
{analysis_prompt}

First pass summary:
{first_pass.summary}
"""

second_pass = user_proxy.initiate_chat(reviewer, message=review_prompt)
print(second_pass.summary)

This pattern is useful when you want an explicit escalation path. If the reviewer disagrees with the first pass, route the case to a human compliance officer with both outputs attached.
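One way to express that escalation rule in code is sketched below. It assumes both passes have already been reduced to structured decisions with status and risk_level fields, which is an assumption beyond the free-text reviewer shown above; the queue names and the route_case helper are hypothetical.

```python
def route_case(first_pass: dict, second_pass: dict) -> dict:
    """Decide where a case goes next, keeping both outputs as evidence."""
    agreed = first_pass["status"] == second_pass["status"]
    high_risk = "high" in (first_pass.get("risk_level"),
                           second_pass.get("risk_level"))
    unresolved = "needs_review" in (first_pass["status"], second_pass["status"])

    if not agreed or high_risk or unresolved:
        queue = "human_compliance_officer"
    else:
        # Continues the normal workflow; any approval still stays a human step.
        queue = "standard_workflow"

    return {"queue": queue, "agreed": agreed,
            "evidence": [first_pass, second_pass]}


analyst = {"status": "non_compliant", "risk_level": "medium"}
reviewer_says = {"status": "needs_review", "risk_level": "medium"}
print(route_case(analyst, reviewer_says)["queue"])
```

Attaching both outputs to the routed case means the human officer sees the disagreement itself, not just a final label.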

Production Considerations

• Keep policy data local where residency matters
  - Pension data often includes national ID numbers, benefit amounts, medical information, and beneficiary details.
  - Store sensitive documents in-region and send only minimized excerpts to the model.
• Log every decision with evidence
  - Persist input payloads, retrieved policy passages, agent outputs, timestamps, model version, and operator overrides.
  - Auditors will ask why a case was marked compliant; “the model said so” is not acceptable.
• Add deterministic guardrails before agent execution
  - Reject missing jurisdiction fields, invalid dates, or unsupported transaction types before calling AutoGen.
  - Never let the agent approve payments directly; keep approval as a human-controlled workflow step.
• Monitor drift in both policy and model behavior
  - Pension rules change after board resolutions, regulator updates, or scheme amendments.
  - Re-run a golden test set weekly against known edge cases like early access requests, deceased-member claims, and cross-border transfers.
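To make the evidence-logging point concrete, here is a small sketch of an audit record that chains each entry to the previous one with a SHA-256 hash, so later edits become detectable. The field names, the build_audit_record and verify_chain helpers, and the chaining scheme itself are assumptions for illustration; a production system would also need durable, access-controlled storage.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    # Canonical JSON so the same content always hashes to the same value.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_record(case_id, request, excerpts, decision, model_version,
                       prev_hash="", operator_override=None, now=None):
    """Assemble one evidence record and seal it with a hash over its contents."""
    record = {
        "case_id": case_id,
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "request": request,
        "policy_excerpts": excerpts,
        "decision": decision,
        "model_version": model_version,
        "operator_override": operator_override,
        "prev_hash": prev_hash,
    }
    record["record_hash"] = _digest(record)
    return record


def verify_chain(records) -> bool:
    """Check every record's hash and its link to the previous record."""
    prev = ""
    for rec in records:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        if rec["prev_hash"] != prev or _digest(body) != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True


first = build_audit_record(
    "case-001", {"action": "early_withdrawal"},
    ["Early withdrawals are prohibited unless explicitly permitted."],
    {"status": "non_compliant", "risk_level": "high"}, "gpt-4o-mini",
)
second = build_audit_record(
    "case-002", {"action": "lump_sum_payment"}, ["Lump-sum benefits require..."],
    {"status": "needs_review", "risk_level": "medium"}, "gpt-4o-mini",
    prev_hash=first["record_hash"],
)
print(verify_chain([first, second]))
```

A hash chain does not prevent tampering on its own, but it gives auditors a cheap way to confirm that the stored evidence matches what was recorded at decision time.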

Common Pitfalls

• Using the model without grounding it in current policy
  - This causes hallucinated compliance answers that look confident but are wrong.
  - Fix it by passing only retrieved policy excerpts into each chat turn and rejecting answers without citations.
• Treating free-text output as machine-ready decisions
  - Downstream systems break when responses vary in format.
  - Fix it by enforcing structured output and validating it before routing cases onward.
• Ignoring pension-specific controls like auditability and residency
  - Generic agent setups often ship prompts to unmanaged endpoints or lose traceability.
  - Fix it by keeping sensitive data localized, logging all interactions immutably if possible, and requiring human review on high-risk actions like benefit disbursements or exception approvals.
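The citation check from the first pitfall can be approximated with a simple grounding test. The sketch below assumes citations are verbatim quotes from the supplied excerpts, which is a strong assumption; if the agent cites clause IDs or paraphrases, the matching logic would need to change. The citations_are_grounded helper is a hypothetical name.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so minor formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def citations_are_grounded(citations, policy_excerpts) -> bool:
    """True only if there is at least one citation and each appears in a supplied excerpt."""
    if not citations:
        return False
    corpus = [_normalize(p) for p in policy_excerpts]
    for citation in citations:
        needle = _normalize(citation)
        if not needle or not any(needle in excerpt for excerpt in corpus):
            return False
    return True


excerpts = ["Early withdrawals are prohibited unless explicitly permitted "
            "by scheme rules and local regulation."]

print(citations_are_grounded(["early withdrawals are prohibited"], excerpts))
print(citations_are_grounded(["Hardship withdrawals are allowed"], excerpts))
```

A decision that fails this check can be downgraded to needs_review rather than discarded, so the case still reaches a human with the ungrounded answer attached.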


By Cyprian Aarons, AI Consultant at Topiax.
