How to Build a Compliance-Checking Agent Using CrewAI in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, crewai, python, investment-banking

A compliance checking agent in investment banking reviews proposed communications, trade-related content, and client-facing documents against policy before they leave the firm. It matters because a single bad email, unsuitable recommendation, or restricted-language violation can create regulatory exposure, audit findings, and real financial penalties.

Architecture

  • Policy retrieval layer

    • Pulls the latest internal policies, restricted lists, and jurisdiction-specific rules.
    • Keeps compliance checks aligned with current bank policy instead of hardcoded prompts.
  • Compliance analysis agent

    • Uses CrewAI to inspect text for violations like misleading claims, MNPI risk, unapproved product language, and missing disclosures.
    • Produces structured findings, not just free-form commentary.
  • Evidence and audit logger

    • Stores input text, policy version, agent output, timestamp, and reviewer decisions.
    • Supports audit trails for model governance and regulator requests.
  • Human escalation path

    • Routes high-risk cases to a compliance officer or legal reviewer.
    • Prevents the agent from making final approval decisions on sensitive content.
  • Output formatter

    • Converts findings into a machine-readable JSON payload for downstream workflow systems.
    • Makes it easy to integrate with ticketing, DMS, or pre-trade approval tools.
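The evidence and audit logger can be sketched as a plain record builder. The field names and the choice to hash the input are illustrative assumptions, not a prescribed schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def build_audit_record(input_text: str, policy_version: str,
                       agent_output: str, reviewer_decision: str) -> dict:
    """Assemble one audit-trail entry. Hashing the input lets the log
    prove exactly what was reviewed, even if the raw text is archived
    elsewhere."""
    return {
        "input_sha256": hashlib.sha256(input_text.encode()).hexdigest(),
        "input_text": input_text,
        "policy_version": policy_version,
        "agent_output": agent_output,
        "reviewer_decision": reviewer_decision,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

record = build_audit_record("Draft email text", "2026-04", "escalate", "pending")
print(json.dumps(record, indent=2))
```

In practice this record would be appended to a write-once store; the point is that every field a regulator might ask about is captured at review time, not reconstructed later.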

Implementation

  1. Install CrewAI and define the compliance task

    Start by installing the library (pip install crewai), then create one agent that behaves like a compliance reviewer and one task that asks for structured output. In investment banking, you want the model to classify risk clearly: approve, reject, or escalate.

    from crewai import Agent, Task, Crew
    from crewai.tools import tool
    from pydantic import BaseModel
    import json
    
    class ComplianceResult(BaseModel):
        decision: str
        risk_level: str
        violations: list[str]
        rationale: str
    
    @tool("policy_lookup")
    def policy_lookup(topic: str) -> str:
        policies = {
            "marketing": "No performance guarantees. Past performance disclaimers required.",
            "mnpi": "Do not disclose material non-public information.",
            "restricted": "Do not mention restricted securities or prohibited counterparties."
        }
        return policies.get(topic.lower(), "No policy found.")
    
    compliance_agent = Agent(
        role="Investment Banking Compliance Reviewer",
        goal="Review content for regulatory and internal policy violations",
        backstory=(
            "You review banker communications and client-facing materials for "
            "compliance risks including MNPI, misleading statements, restricted securities, "
            "and missing disclosures."
        ),
        tools=[policy_lookup],
        verbose=True,
        allow_delegation=False,
    )
    
    compliance_task = Task(
        description=(
            "Review the following draft email for compliance issues. "
            "Return a concise assessment with decision, risk_level, violations, and rationale.\n\n"
            "Draft:\n"
            "{draft_text}"
        ),
        expected_output="Structured compliance assessment with clear decision.",
        output_pydantic=ComplianceResult,  # enforce the schema defined above
        agent=compliance_agent,
    )
    
  2. Run the crew and force an operationally useful response

    The key pattern is to keep the task narrow. Don’t ask the model to be a lawyer; ask it to identify issues against named policies and produce a review result your workflow can route.

    def review_draft(draft_text: str) -> str:
        crew = Crew(
            agents=[compliance_agent],
            tasks=[compliance_task],
            verbose=True,
        )
        result = crew.kickoff(inputs={"draft_text": draft_text})
        return str(result)
    
    if __name__ == "__main__":
        draft = (
            "Hi client team,\n\n"
            "Our desk believes this structured note will outperform the market by 20% "
            "and there is no downside risk. Please share with your clients today.\n"
        )
    
        output = review_draft(draft)
        print(output)
    
  3. Wrap the agent with deterministic post-processing

    In production, you should not trust raw text alone. Parse the output into a schema so downstream systems can route approvals consistently.

    def normalize_result(raw_output: str) -> dict:
        # If your prompt is stable enough, you can parse JSON here.
        # For stronger guarantees, instruct the agent to emit strict JSON.
        try:
            return json.loads(raw_output)
        except json.JSONDecodeError:
            return {
                "decision": "escalate",
                "risk_level": "high",
                "violations": ["unstructured_output"],
                "rationale": raw_output,
            }
    
    result_text = review_draft("Please tell clients this trade is guaranteed profit.")
    normalized = normalize_result(result_text)
    print(normalized)
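    Models often wrap JSON output in Markdown code fences, which makes json.loads fail even when the payload inside is valid. A small pre-parse helper makes normalization more forgiving; this sketch assumes the common triple-backtick json wrapper:

```python
import json

def extract_json(raw_output: str) -> str:
    """Strip a surrounding Markdown code fence, if present, before parsing."""
    text = raw_output.strip()
    if text.startswith("```"):
        # Drop the opening fence line (e.g. a ```json marker) and, if
        # present, the closing fence line.
        lines = text.splitlines()
        if lines[-1].strip() == "```":
            lines = lines[1:-1]
        else:
            lines = lines[1:]
        text = "\n".join(lines)
    return text

fenced = '```json\n{"decision": "reject", "risk_level": "high"}\n```'
parsed = json.loads(extract_json(fenced))
print(parsed["decision"])
```

    Calling extract_json before json.loads inside normalize_result keeps the escalate-on-unparseable fallback as a last resort rather than the common path.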
    
  4. Add escalation logic for bank-grade controls

    A compliance agent should never auto-approve high-risk content without thresholds. Use simple business rules around keywords or confidence signals to force human review when needed.

    HIGH_RISK_TRIGGERS = [
        "guaranteed profit",
        "no downside risk",
        "material non-public information",
        "inside information",
    ]
    
    def requires_human_review(text: str) -> bool:
        lowered = text.lower()
        return any(trigger in lowered for trigger in HIGH_RISK_TRIGGERS)
    
    draft = "This trade is guaranteed profit with no downside risk."
    if requires_human_review(draft):
        print("Route to compliance officer")
    else:
        print(review_draft(draft))
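
    The keyword triggers and the normalized agent decision can be combined into a single routing function. The decision and risk_level values assumed here follow the schema from step 3; human review always wins over any agent verdict:

```python
HIGH_RISK = ["guaranteed profit", "no downside risk", "inside information"]

def route(draft_text: str, normalized: dict) -> str:
    """Return a routing target. Keyword triggers force human review
    regardless of the agent's decision; only low-risk approvals bypass
    the compliance queue."""
    if any(trigger in draft_text.lower() for trigger in HIGH_RISK):
        return "compliance_officer"
    if normalized.get("decision") == "approve" and normalized.get("risk_level") == "low":
        return "auto_release"
    return "compliance_officer"

print(route("Quarterly update attached.", {"decision": "approve", "risk_level": "low"}))
```

    The deliberately conservative default (anything not explicitly a low-risk approval goes to a human) is the safer failure mode for regulated content.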
    

Production Considerations

  • Data residency

    • Keep client drafts, trade context, and policy documents in-region if your bank has jurisdictional constraints.
    • If you use hosted LLM endpoints, confirm where prompts and logs are stored.
  • Auditability

    • Persist every input, retrieved policy version, model response, and final human decision.
    • Regulators care about traceability more than clever prompting.
  • Guardrails

    • Use allowlisted tools only.
    • Block the agent from generating final legal opinions or approving trades without human sign-off.
  • Monitoring

    • Track false positives on benign banker communications and false negatives on risky language.
    • Re-run evaluation sets when policies change or models are upgraded.
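
The monitoring idea above can be made concrete with a small labeled evaluation set. The cases and the keyword classifier here are toy assumptions standing in for your real review pipeline:

```python
def evaluate(cases, classifier):
    """cases: list of (text, is_risky) pairs; classifier returns True when
    it flags a text. Returns (false_positive_rate, false_negative_rate)
    over the labeled set."""
    fp = fn = benign = risky = 0
    for text, is_risky in cases:
        flagged = classifier(text)
        if is_risky:
            risky += 1
            fn += not flagged
        else:
            benign += 1
            fp += flagged
    return fp / max(benign, 1), fn / max(risky, 1)

cases = [
    ("Quarterly newsletter attached.", False),
    ("This note is guaranteed profit.", True),
]

def flag(text: str) -> bool:
    return "guaranteed" in text.lower()

print(evaluate(cases, flag))
```

Re-running this evaluation whenever policies change or the model is upgraded gives you a regression signal before risky drafts slip through or benign banker communications get over-flagged.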

Common Pitfalls

  • Treating the agent as an approver

    • The agent should flag risk; it should not replace compliance sign-off.
    • Fix this by wiring an explicit escalation step for medium/high-risk cases.
  • Using vague prompts without policy anchors

    • “Check this email” is too broad for investment banking.
    • Fix this by referencing specific rules like MNPI handling, restricted lists, marketing claims, and disclosure requirements.
  • Skipping structured outputs

    • Free-form summaries are hard to automate and impossible to audit at scale.
    • Fix this by enforcing a schema such as decision, risk_level, violations, and rationale.
  • Ignoring jurisdiction differences

    • A rule that passes in one region may fail under another regulator’s framework.
    • Fix this by passing jurisdiction metadata into the task and retrieving region-specific policy text before review.
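
As a sketch of that fix, jurisdiction metadata can select region-specific policy text before the review task runs. The policy snippets below are illustrative placeholders, not authoritative rule text:

```python
# Hypothetical region-keyed policy snippets; a real system would retrieve
# these from a versioned policy store.
REGION_POLICIES = {
    "US": "FINRA Rule 2210: communications must be fair, balanced, and not misleading.",
    "UK": "FCA COBS 4: financial promotions must be clear, fair, and not misleading.",
    "EU": "MiFID II: marketing communications must be identifiable and not misleading.",
}

def build_task_inputs(draft_text: str, jurisdiction: str) -> dict:
    """Attach region-specific policy text so the agent reviews against the
    right regulatory framework instead of a generic one."""
    policy = REGION_POLICIES.get(jurisdiction.upper(), "No regional policy found.")
    return {
        "draft_text": draft_text,
        "jurisdiction": jurisdiction,
        "policy_text": policy,
    }

inputs = build_task_inputs("Draft pitch text", "UK")
print(inputs["policy_text"])
```

Passing the resulting dict to crew.kickoff(inputs=...) keeps the same agent and task reusable across regions, with only the retrieved policy text varying.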

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
