How to Build a Compliance-Checking Agent Using AutoGen in Python for Banking
A compliance-checking agent for banking reviews customer communications, product changes, KYC artifacts, or transaction narratives against policy and regulatory rules before anything goes live. It matters because one missed disclosure, one prohibited phrase, or one unreviewed exception can become a regulatory finding, a customer-harm issue, or an audit problem.
Architecture
- **Policy loader**
  - Pulls bank-specific controls from approved sources: policy docs, control matrices, and legal rulebooks.
  - Keep this outside the model prompt so policy updates do not require code changes.
- **Compliance reviewer agent**
  - Uses `autogen.AssistantAgent` to inspect the content and map it to controls.
  - Produces structured findings: pass/fail, risk level, violated rule, and remediation.
- **Evidence collector**
  - Uses `autogen.UserProxyAgent` to submit the artifact being checked and to execute any local validation functions.
  - Stores input/output pairs for audit trails.
- **Escalation path**
  - Routes ambiguous cases to a human compliance officer.
  - This is not optional in banking; the agent should recommend escalation when confidence is low or rules conflict.
- **Audit logger**
  - Persists every decision with timestamps, policy version, model name, and reviewer output.
  - Required for traceability during internal audit and regulator review.
- **Data protection layer**
  - Redacts PII before sending text to the model when possible (see the redaction sketch after this list).
  - Enforces residency constraints by keeping processing in approved regions or on-prem infrastructure.
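The redaction step can start small. Below is a minimal sketch of what the data protection layer might do before any text leaves your environment; the `redact_pii` helper and its regex patterns are illustrative only, and a production layer would use a vetted PII-detection service approved by your bank rather than a handful of regexes.

```python
import re

# Illustrative patterns only -- not exhaustive, and not a substitute for a
# vetted PII-detection service.
PII_PATTERNS = {
    "account_number": re.compile(r"\b\d{8,17}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w.-]+\.\w+"),
}

def redact_pii(text: str) -> str:
    # Replace each match with a typed placeholder so the reviewer still sees
    # that a value was present without seeing the value itself.
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text
```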
Implementation
1) Set up AutoGen agents for review and execution
Use one assistant agent to perform compliance analysis and one user proxy agent to run local checks. In banking, keep the assistant constrained to analysis only; do not let it call arbitrary tools.
```python
import os
import json

from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
    "temperature": 0,  # deterministic output for repeatable reviews
}

compliance_agent = AssistantAgent(
    name="compliance_reviewer",
    llm_config=llm_config,
    system_message=(
        "You are a banking compliance reviewer. "
        "Check content against AML, KYC, disclosure, privacy, and marketing rules. "
        "Return only valid JSON with keys: status, risk_level, issues, remediation."
    ),
)

# Submits artifacts and runs local validation; code_execution_config=False
# ensures it never executes model-suggested code.
executor = UserProxyAgent(
    name="policy_executor",
    human_input_mode="NEVER",
    code_execution_config=False,
)
```
2) Define bank-specific policy checks as deterministic rules
Do not rely on the LLM alone for hard controls. Use deterministic checks for things like prohibited phrases, missing disclosures, or PII leakage.
```python
PROHIBITED_PHRASES = [
    "guaranteed approval",
    "no credit check required",
    "instant loan for everyone",
]

REQUIRED_DISCLOSURES = [
    "Terms and conditions apply",
    "Subject to credit approval",
]

def local_policy_check(text: str) -> dict:
    issues = []
    lowered = text.lower()

    for phrase in PROHIBITED_PHRASES:
        if phrase in lowered:
            issues.append({
                "rule": "marketing_prohibited_claim",
                "detail": f"Found prohibited phrase: {phrase}",
                "severity": "high",
            })

    for disclosure in REQUIRED_DISCLOSURES:
        if disclosure.lower() not in lowered:
            issues.append({
                "rule": "missing_required_disclosure",
                "detail": f"Missing disclosure: {disclosure}",
                "severity": "medium",
            })

    status = "fail" if any(i["severity"] == "high" for i in issues) else ("review" if issues else "pass")
    return {"status": status, "issues": issues}
```
3) Send the artifact to AutoGen and combine model output with local controls
The pattern below calls `AssistantAgent.generate_reply()` with a standard chat-message list. The model performs the contextual review; your deterministic layer enforces hard bank policy.
```python
def run_compliance_review(document_text: str) -> dict:
    local_result = local_policy_check(document_text)

    prompt = f"""
Review this banking content for compliance risks.

Document:
{document_text}

Local rule result:
{json.dumps(local_result)}

Return JSON only with:
status: pass|review|fail
risk_level: low|medium|high
issues: array of objects with rule, detail, severity
remediation: array of concrete fixes
"""

    messages = [{"role": "user", "content": prompt}]
    reply = compliance_agent.generate_reply(messages=messages)

    # Fail safe: unparseable model output becomes a finding, not a crash.
    try:
        model_result = json.loads(reply if isinstance(reply, str) else reply["content"])
    except Exception:
        model_result = {
            "status": "review",
            "risk_level": "medium",
            "issues": [{"rule": "invalid_model_output", "detail": str(reply), "severity": "medium"}],
            "remediation": ["Require structured JSON output from the reviewer."],
        }

    combined_issues = local_result["issues"] + model_result.get("issues", [])
    final_status = (
        "fail"
        if any(i["severity"] == "high" for i in combined_issues)
        else ("review" if combined_issues else "pass")
    )
    return {
        "status": final_status,
        "local_result": local_result,
        "model_result": model_result,
        "combined_issues": combined_issues,
    }
```
4) Persist audit evidence for traceability
Every review needs an immutable record. At minimum, store the input hash, policy version, timestamp, result status, and reviewer output.
```python
from datetime import datetime
from hashlib import sha256

def audit_record(document_text: str, result: dict) -> dict:
    return {
        "timestamp_utc": datetime.utcnow().isoformat() + "Z",
        "document_hash": sha256(document_text.encode("utf-8")).hexdigest(),
        "policy_version": os.getenv("POLICY_VERSION", "2026.01"),
        "model_name": llm_config["model"],
        "result": result,
    }

sample_doc = """
This card offer includes guaranteed approval and instant loan for everyone.
"""

result = run_compliance_review(sample_doc)
record = audit_record(sample_doc, result)
print(json.dumps(record, indent=2))
```
Production Considerations
- **Keep sensitive data out of prompts**
  - Redact account numbers, SSNs/NINs, addresses, and transaction identifiers before review.
  - If you need full-fidelity inspection, run inside your bank’s controlled environment with approved retention rules.
- **Version policies like code**
  - Store control mappings in Git and tag every decision with a policy version.
  - When legal updates a rule set, you want reproducible historical decisions during audits.
- **Add human escalation thresholds**
  - Route anything with conflicting signals or low-confidence outputs to a compliance analyst.
  - For regulated content approvals, “review” should mean “blocked until signed off,” not “best effort.”
- **Log everything needed for audit**
  - Capture request ID, reviewer output, prompt template version, model version, and redaction state.
  - Make logs tamper-evident (one lightweight approach is sketched after this list) and align retention with your bank’s recordkeeping schedule.
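On the tamper-evident point: one lightweight option is to hash-chain audit records, so altering or deleting any historical entry invalidates every hash after it. The `append_audit_record` helper below is a sketch of that idea under my own naming, not a replacement for whatever WORM or archival storage your bank already mandates.

```python
import json
from hashlib import sha256

def chain_hash(prev_hash: str, record: dict) -> str:
    # Hash this record together with the previous record's hash; editing or
    # deleting any earlier entry breaks every hash that follows it.
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return sha256(payload.encode("utf-8")).hexdigest()

def append_audit_record(log: list, record: dict) -> None:
    prev = log[-1]["chain_hash"] if log else "genesis"
    log.append({**record, "chain_hash": chain_hash(prev, record)})
```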
Common Pitfalls
- **Letting the LLM be the source of truth**
  - Mistake: using the model alone to decide pass/fail.
  - Fix: use deterministic checks for hard rules and reserve the model for contextual judgment.
- **Skipping data residency constraints**
  - Mistake: sending customer data to an unapproved region or public endpoint.
  - Fix: route traffic through approved infrastructure and verify vendor deployment options before production.
- **No structured output contract**
  - Mistake: accepting free-form prose from the agent.
  - Fix: require JSON output with strict keys and validate it before downstream use (see the validation sketch below).
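To enforce that contract, validate the parsed JSON before anything downstream consumes it. The `validate_review` helper below is an illustrative sketch of a strict-keys check; a JSON Schema or Pydantic model would do the same job.

```python
ALLOWED_STATUS = {"pass", "review", "fail"}
ALLOWED_RISK = {"low", "medium", "high"}

def validate_review(result: dict) -> dict:
    # Reject anything that drifts from the contract instead of passing it on.
    if result.get("status") not in ALLOWED_STATUS:
        raise ValueError(f"invalid status: {result.get('status')!r}")
    if result.get("risk_level") not in ALLOWED_RISK:
        raise ValueError(f"invalid risk_level: {result.get('risk_level')!r}")
    if not isinstance(result.get("issues"), list):
        raise ValueError("issues must be a list")
    return result
```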
A banking compliance agent is useful only when it is boringly predictable. If you can’t explain to internal audit or a regulator, three months later, why it passed or failed a document, it is not production-ready.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.