How to Build a Compliance-Checking Agent Using CrewAI in Python for Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, crewai, python, banking

A compliance checking agent reviews banking documents, customer communications, or transaction narratives against policy rules and flags potential breaches before they hit production or an audit queue. In banking, that matters because compliance failures create regulatory exposure, operational risk, and expensive manual review cycles.

Architecture

A production compliance agent for banking needs a small but strict set of components:

  • Policy knowledge source

    • Internal AML/KYC rules, sanctions policy, product suitability rules, and jurisdiction-specific controls.
    • Keep this in versioned documents or a controlled retrieval layer.
  • Document intake layer

    • Accepts structured inputs like JSON transaction records, customer emails, chat transcripts, or loan application notes.
    • Normalizes inputs before the agent sees them.
  • Compliance analyst agent

    • Uses crewai.Agent to inspect the input against policy.
    • Produces a decision: pass, flag for review, or reject.
  • Task definition

    • Uses crewai.Task to define the exact compliance checks and output format.
    • This is where you force traceable, auditable results.
  • Crew orchestration

    • Uses crewai.Crew to run the workflow deterministically.
    • For banking, keep the flow simple: one analyst agent plus an optional reviewer agent.
  • Audit output store

    • Persists decision, rationale, timestamps, model version, and source references.
    • This is non-negotiable for regulators and internal audit.
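The audit output store in that last component can be sketched as a plain record type. The field names below are illustrative assumptions, not a CrewAI API; adapt them to your bank's audit schema.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    case_id: str
    decision: str            # PASS / REVIEW / REJECT
    rationale: str
    model_version: str
    prompt_version: str
    input_hash: str          # hash of the raw input, not the PII itself
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def make_audit_record(case_id: str, decision: str, rationale: str,
                      raw_input: str, model_version: str,
                      prompt_version: str) -> AuditRecord:
    # Store a hash of the input so the decision is reproducible
    # without persisting customer data verbatim.
    return AuditRecord(
        case_id=case_id,
        decision=decision,
        rationale=rationale,
        model_version=model_version,
        prompt_version=prompt_version,
        input_hash=hashlib.sha256(raw_input.encode("utf-8")).hexdigest(),
    )
```

Hashing the input rather than copying it keeps the record reproducible while limiting PII spread.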

Implementation

1) Install CrewAI and define your compliance input schema

Start with a strict data contract. Banking workflows fail when free-form text leaks into the decision path.

from pydantic import BaseModel, Field
from typing import List

class ComplianceCase(BaseModel):
    case_id: str
    customer_id: str
    jurisdiction: str
    channel: str
    narrative: str
    flags: List[str] = Field(default_factory=list)

This keeps your agent focused on a bounded payload. You can extend it later with transaction amount, product type, or sanctions screening hits.
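Assuming this schema, a strict parsing helper can reject malformed payloads before any model call. The snippet repeats the model so it stands alone; the error-handling convention is one possible sketch.

```python
from pydantic import BaseModel, Field, ValidationError
from typing import List

class ComplianceCase(BaseModel):
    case_id: str
    customer_id: str
    jurisdiction: str
    channel: str
    narrative: str
    flags: List[str] = Field(default_factory=list)

def parse_case(raw_json: str) -> ComplianceCase:
    """Validate an incoming payload strictly before it reaches the agent."""
    try:
        return ComplianceCase.model_validate_json(raw_json)
    except ValidationError as exc:
        # Fail closed: a malformed case never reaches the LLM.
        raise ValueError(f"Invalid compliance case payload: {exc}") from exc
```

Failing closed here means a bad upstream feed surfaces as an explicit error instead of a silently degraded prompt.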

2) Create the compliance agent and task

Use a constrained role description. The model should not improvise policy; it should apply the rules you provide.

from crewai import Agent, Task

compliance_agent = Agent(
    role="Banking Compliance Analyst",
    goal="Review cases for AML, KYC, sanctions, suitability, and policy violations.",
    backstory=(
        "You are a senior banking compliance analyst. "
        "You only use provided policy context and case facts. "
        "You return concise decisions suitable for audit."
    ),
    verbose=True,
    allow_delegation=False,
)

compliance_task = Task(
    description=(
        "Assess the following banking case for compliance risk.\n"
        "Return one of: PASS, REVIEW, REJECT.\n"
        "Include specific rule references and a short audit-ready rationale.\n\n"
        "Case:\n"
        "{case}"
    ),
    expected_output=(
        "A structured assessment with decision, rationale, risk level, "
        "and any violated policy areas."
    ),
    agent=compliance_agent,
)

The key pattern here is that the task tells the agent exactly what output shape you want. That reduces vague responses and makes downstream automation easier.

3) Run the crew and capture an auditable result

For a first production pattern, keep one agent and one task. Add more agents only when there is a clear separation of duties.

from crewai import Crew, Process

crew = Crew(
    agents=[compliance_agent],
    tasks=[compliance_task],
    process=Process.sequential,
    verbose=True,
)

case = ComplianceCase(
    case_id="CASE-10492",
    customer_id="CUST-7712",
    jurisdiction="UK",
    channel="online_banking",
    narrative=(
        "Customer requested multiple same-day transfers to newly added beneficiaries "
        "with no prior relationship history."
    ),
    flags=["rapid_transfers", "new_beneficiaries"],
)

result = crew.kickoff(inputs={"case": case.model_dump_json()})
print(result)

In banking systems, do not send raw PII unless you have approval for that data path. If your deployment must stay in-region, run this inside your approved cloud region or on-prem environment and keep all logs local.
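A minimal redaction guardrail might look like the sketch below. The regex patterns are illustrative assumptions (UK-style account numbers and sort codes, generic card PANs) and should be tuned to the formats your bank actually handles.

```python
import re

# Illustrative patterns; tune to your institution's real formats.
SORT_CODE_RE = re.compile(r"\b\d{2}-\d{2}-\d{2}\b")   # e.g. 12-34-56
PAN_RE = re.compile(r"\b\d{13,19}\b")                  # card PANs
ACCOUNT_RE = re.compile(r"\b\d{8}\b")                  # 8-digit account numbers

def redact(text: str) -> str:
    """Replace sensitive identifiers with placeholders before any LLM call."""
    text = SORT_CODE_RE.sub("[SORT_CODE]", text)
    text = PAN_RE.sub("[PAN]", text)       # longer runs first, so PANs
    text = ACCOUNT_RE.sub("[ACCOUNT]", text)  # are not mistaken for accounts
    return text
```

Run `redact()` over the narrative (and any other free-text field) before building the task input, and keep the unredacted original only in your approved data path.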

4) Add a second-step human review handoff

The agent should not be your final authority on high-risk cases. Use it as a triage layer that routes to humans when confidence is low or impact is high.

def route_case(output_text: str) -> str:
    if "REJECT" in output_text:
        return "manual_review_queue"
    if "REVIEW" in output_text:
        return "compliance_analyst_queue"
    return "auto_approve"

queue = route_case(str(result))
print({"case_id": case.case_id, "queue": queue})

This pattern is simple but effective. It keeps automation bounded and gives auditors a clear control point.

Production Considerations

  • Keep data residency explicit

    • Bank customer data often cannot leave a specific country or region.
    • Pin model endpoints, vector stores, logs, and backups to approved regions.
  • Log every decision path

    • Store input hash, output text, timestamp, model name/version, prompt version, and reviewer override.
    • Regulators care about reproducibility more than clever prompts.
  • Add guardrails before the LLM call

    • Redact account numbers, national IDs, card PANs, and sensitive free text where possible.
    • Validate schema strictly before invoking crew.kickoff().
  • Use deterministic escalation thresholds

    • High-value transactions or sanctions-related hits should bypass auto-pass logic.
    • Route borderline outcomes to human compliance staff by rule.
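The deterministic escalation rules above can be sketched as a plain function that runs outside the LLM path. The threshold value and flag names are illustrative assumptions; set them from policy, not from model output.

```python
HIGH_VALUE_THRESHOLD = 10_000  # illustrative; set per policy and currency

def must_escalate(amount: float, flags: list) -> bool:
    """Deterministic rules that bypass any auto-pass logic."""
    if amount >= HIGH_VALUE_THRESHOLD:
        return True
    if "sanctions_hit" in flags:
        return True
    return False
```

Because this check is pure Python, it is trivially testable and auditable, and it guarantees that high-risk cases reach humans even if the model output is wrong or malformed.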

Common Pitfalls

  • Using the agent as an oracle

    • Mistake: letting the model decide final disposition on regulated cases.
    • Fix: make it a triage and summarization layer with human approval for risky outcomes.
  • Passing unstructured bank data directly into prompts

    • Mistake: dumping raw emails or transaction feeds into the task description.
    • Fix: normalize into a schema like ComplianceCase first and redact sensitive fields.
  • Skipping audit metadata

    • Mistake: storing only the final label.
    • Fix: persist full traces — prompt version, model version, rationale text, input hash, reviewer action — so internal audit can reconstruct the decision later.
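One way to persist such a trace is an append-only JSONL file (or its equivalent in your audit database). The record fields below are an illustrative sketch, not a prescribed format.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional

def persist_trace(path: Path, case_id: str, raw_input: str,
                  output_text: str, model_version: str,
                  prompt_version: str,
                  reviewer_action: Optional[str] = None) -> None:
    """Append one decision trace so audit can reconstruct it later."""
    record = {
        "case_id": case_id,
        "input_hash": hashlib.sha256(raw_input.encode("utf-8")).hexdigest(),
        "output_text": output_text,
        "model_version": model_version,
        "prompt_version": prompt_version,
        "reviewer_action": reviewer_action,
    }
    # Append-only: existing lines are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

An append-only log is easy to ship to WORM storage, which is often what internal audit and regulators expect for decision trails.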

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
