How to Build a Compliance-Checking Agent Using CrewAI in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, crewai, python, investment-banking

A compliance checking agent in investment banking reviews proposed communications, trade-related content, and client-facing documents against policy before they leave the firm. It matters because a single bad email, unsuitable recommendation, or restricted-language violation can create regulatory exposure, audit findings, and real financial penalties.

Architecture

  • Policy retrieval layer

    • Pulls the latest internal policies, restricted lists, and jurisdiction-specific rules.
    • Keeps compliance checks aligned with current bank policy instead of hardcoded prompts.
  • Compliance analysis agent

    • Uses CrewAI to inspect text for violations like misleading claims, MNPI risk, unapproved product language, and missing disclosures.
    • Produces structured findings, not just free-form commentary.
  • Evidence and audit logger

    • Stores input text, policy version, agent output, timestamp, and reviewer decisions.
    • Supports audit trails for model governance and regulator requests.
  • Human escalation path

    • Routes high-risk cases to a compliance officer or legal reviewer.
    • Prevents the agent from making final approval decisions on sensitive content.
  • Output formatter

    • Converts findings into a machine-readable JSON payload for downstream workflow systems.
    • Makes it easy to integrate with ticketing, DMS, or pre-trade approval tools.
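The evidence and audit logger can be sketched as a plain record builder. The field names and the choice to hash the input are illustrative assumptions, not a prescribed schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def build_audit_record(input_text: str, policy_version: str,
                       agent_output: str, reviewer_decision: str) -> dict:
    """Assemble one audit-trail entry. Hashing the input lets the log
    prove exactly what was reviewed, even if the raw text is archived
    elsewhere."""
    return {
        "input_sha256": hashlib.sha256(input_text.encode()).hexdigest(),
        "input_text": input_text,
        "policy_version": policy_version,
        "agent_output": agent_output,
        "reviewer_decision": reviewer_decision,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

record = build_audit_record("Draft email text", "2026-04", "escalate", "pending")
print(json.dumps(record, indent=2))
```

In practice this record would be appended to a write-once store; the point is that every field a regulator might ask about is captured at review time, not reconstructed later.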

Implementation

  1. Install CrewAI and define the compliance task

    Start by installing the library (pip install crewai), then create one agent that behaves like a compliance reviewer and one task that asks for structured output. In investment banking, you want the model to classify risk clearly: approve, reject, or escalate.

    from crewai import Agent, Task, Crew
    from crewai.tools import tool
    from pydantic import BaseModel
    import json
    
    class ComplianceResult(BaseModel):
        decision: str
        risk_level: str
        violations: list[str]
        rationale: str
    
    @tool("policy_lookup")
    def policy_lookup(topic: str) -> str:
        policies = {
            "marketing": "No performance guarantees. Past performance disclaimers required.",
            "mnpi": "Do not disclose material non-public information.",
            "restricted": "Do not mention restricted securities or prohibited counterparties."
        }
        return policies.get(topic.lower(), "No policy found.")
    
    compliance_agent = Agent(
        role="Investment Banking Compliance Reviewer",
        goal="Review content for regulatory and internal policy violations",
        backstory=(
            "You review banker communications and client-facing materials for "
            "compliance risks including MNPI, misleading statements, restricted securities, "
            "and missing disclosures."
        ),
        tools=[policy_lookup],
        verbose=True,
        allow_delegation=False,
    )
    
    compliance_task = Task(
        description=(
            "Review the following draft email for compliance issues. "
            "Return a concise assessment with decision, risk_level, violations, and rationale.\n\n"
            "Draft:\n"
            "{draft_text}"
        ),
        expected_output="Structured compliance assessment with clear decision.",
        output_pydantic=ComplianceResult,  # enforce the schema defined above
        agent=compliance_agent,
    )
    
  2. Run the crew and force an operationally useful response

    The key pattern is to keep the task narrow. Don’t ask the model to be a lawyer; ask it to identify issues against named policies and produce a review result your workflow can route.

    def review_draft(draft_text: str) -> str:
        crew = Crew(
            agents=[compliance_agent],
            tasks=[compliance_task],
            verbose=True,
        )
        result = crew.kickoff(inputs={"draft_text": draft_text})
        return str(result)
    
    if __name__ == "__main__":
        draft = (
            "Hi client team,\n\n"
            "Our desk believes this structured note will outperform the market by 20% "
            "and there is no downside risk. Please share with your clients today.\n"
        )
    
        output = review_draft(draft)
        print(output)
    
  3. Wrap the agent with deterministic post-processing

    In production, you should not trust raw text alone. Parse the output into a schema so downstream systems can route approvals consistently.

    def normalize_result(raw_output: str) -> dict:
        # If your prompt is stable enough, you can parse JSON here.
        # For stronger guarantees, instruct the agent to emit strict JSON.
        try:
            return json.loads(raw_output)
        except json.JSONDecodeError:
            return {
                "decision": "escalate",
                "risk_level": "high",
                "violations": ["unstructured_output"],
                "rationale": raw_output,
            }
    
    result_text = review_draft("Please tell clients this trade is guaranteed profit.")
    normalized = normalize_result(result_text)
    print(normalized)
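    Models often wrap JSON output in Markdown code fences, which makes json.loads fail even when the payload inside is valid. A small pre-parse helper makes normalization more forgiving; this sketch assumes the common triple-backtick json wrapper:

```python
import json

def extract_json(raw_output: str) -> str:
    """Strip a surrounding Markdown code fence, if present, before parsing."""
    text = raw_output.strip()
    if text.startswith("```"):
        # Drop the opening fence line (e.g. a ```json marker) and, if
        # present, the closing fence line.
        lines = text.splitlines()
        if lines[-1].strip() == "```":
            lines = lines[1:-1]
        else:
            lines = lines[1:]
        text = "\n".join(lines)
    return text

fenced = '```json\n{"decision": "reject", "risk_level": "high"}\n```'
parsed = json.loads(extract_json(fenced))
print(parsed["decision"])
```

    Calling extract_json before json.loads inside normalize_result keeps the escalate-on-unparseable fallback as a last resort rather than the common path.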
    
  4. Add escalation logic for bank-grade controls

    A compliance agent should never auto-approve high-risk content without thresholds. Use simple business rules around keywords or confidence signals to force human review when needed.

    HIGH_RISK_TRIGGERS = [
        "guaranteed profit",
        "no downside risk",
        "material non-public information",
        "inside information",
    ]
    
    def requires_human_review(text: str) -> bool:
        lowered = text.lower()
        return any(trigger in lowered for trigger in HIGH_RISK_TRIGGERS)
    
    draft = "This trade is guaranteed profit with no downside risk."
    if requires_human_review(draft):
        print("Route to compliance officer")
    else:
        print(review_draft(draft))
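
    The keyword triggers and the normalized agent decision can be combined into a single routing function. The decision and risk_level values assumed here follow the schema from step 3; human review always wins over any agent verdict:

```python
HIGH_RISK = ["guaranteed profit", "no downside risk", "inside information"]

def route(draft_text: str, normalized: dict) -> str:
    """Return a routing target. Keyword triggers force human review
    regardless of the agent's decision; only low-risk approvals bypass
    the compliance queue."""
    if any(trigger in draft_text.lower() for trigger in HIGH_RISK):
        return "compliance_officer"
    if normalized.get("decision") == "approve" and normalized.get("risk_level") == "low":
        return "auto_release"
    return "compliance_officer"

print(route("Quarterly update attached.", {"decision": "approve", "risk_level": "low"}))
```

    The deliberately conservative default (anything not explicitly a low-risk approval goes to a human) is the safer failure mode for regulated content.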
    

Production Considerations

  • Data residency

    • Keep client drafts, trade context, and policy documents in-region if your bank has jurisdictional constraints.
    • If you use hosted LLM endpoints, confirm where prompts and logs are stored.
  • Auditability

    • Persist every input, retrieved policy version, model response, and final human decision.
    • Regulators care about traceability more than clever prompting.
  • Guardrails

    • Use allowlisted tools only.
    • Block the agent from generating final legal opinions or approving trades without human sign-off.
  • Monitoring

    • Track false positives on benign banker communications and false negatives on risky language.
    • Re-run evaluation sets when policies change or models are upgraded.
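
The monitoring idea above can be made concrete with a small labeled evaluation set. The cases and the keyword classifier here are toy assumptions standing in for your real review pipeline:

```python
def evaluate(cases, classifier):
    """cases: list of (text, is_risky) pairs; classifier returns True when
    it flags a text. Returns (false_positive_rate, false_negative_rate)
    over the labeled set."""
    fp = fn = benign = risky = 0
    for text, is_risky in cases:
        flagged = classifier(text)
        if is_risky:
            risky += 1
            fn += not flagged
        else:
            benign += 1
            fp += flagged
    return fp / max(benign, 1), fn / max(risky, 1)

cases = [
    ("Quarterly newsletter attached.", False),
    ("This note is guaranteed profit.", True),
]

def flag(text: str) -> bool:
    return "guaranteed" in text.lower()

print(evaluate(cases, flag))
```

Re-running this evaluation whenever policies change or the model is upgraded gives you a regression signal before risky drafts slip through or benign banker communications get over-flagged.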

Common Pitfalls

  • Treating the agent as an approver

    • The agent should flag risk; it should not replace compliance sign-off.
    • Fix this by wiring an explicit escalation step for medium/high-risk cases.
  • Using vague prompts without policy anchors

    • “Check this email” is too broad for investment banking.
    • Fix this by referencing specific rules like MNPI handling, restricted lists, marketing claims, and disclosure requirements.
  • Skipping structured outputs

    • Free-form summaries are hard to automate and impossible to audit at scale.
    • Fix this by enforcing a schema such as decision, risk_level, violations, and rationale.
  • Ignoring jurisdiction differences

    • A rule that passes in one region may fail under another regulator’s framework.
    • Fix this by passing jurisdiction metadata into the task and retrieving region-specific policy text before review.
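
As a sketch of that fix, jurisdiction metadata can select region-specific policy text before the review task runs. The policy snippets below are illustrative placeholders, not authoritative rule text:

```python
# Hypothetical region-keyed policy snippets; a real system would retrieve
# these from a versioned policy store.
REGION_POLICIES = {
    "US": "FINRA Rule 2210: communications must be fair, balanced, and not misleading.",
    "UK": "FCA COBS 4: financial promotions must be clear, fair, and not misleading.",
    "EU": "MiFID II: marketing communications must be identifiable and not misleading.",
}

def build_task_inputs(draft_text: str, jurisdiction: str) -> dict:
    """Attach region-specific policy text so the agent reviews against the
    right regulatory framework instead of a generic one."""
    policy = REGION_POLICIES.get(jurisdiction.upper(), "No regional policy found.")
    return {
        "draft_text": draft_text,
        "jurisdiction": jurisdiction,
        "policy_text": policy,
    }

inputs = build_task_inputs("Draft pitch text", "UK")
print(inputs["policy_text"])
```

Passing the resulting dict to crew.kickoff(inputs=...) keeps the same agent and task reusable across regions, with only the retrieved policy text varying.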

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
