How to Build a Fraud Detection Agent Using CrewAI in Python for Investment Banking
A fraud detection agent in investment banking ingests transaction and client activity data, flags suspicious patterns, and routes high-risk cases to human investigators with enough evidence to act fast. It matters because false negatives mean financial losses and regulatory exposure, while false positives burn analyst time and create friction in high-value client operations.
Architecture
- Data ingestion layer
  - Pulls transactions, account events, KYC attributes, device/session signals, and watchlist hits from internal systems.
  - Normalizes records into a consistent schema before the agent sees them.
- Detection analyst agent
  - Reviews a transaction or case summary.
  - Identifies suspicious patterns like structuring, rapid movement of funds, unusual counterparties, or jurisdiction mismatches.
- Compliance reviewer agent
  - Checks findings against policy rules, AML/KYC controls, sanctions constraints, and escalation thresholds.
  - Produces an audit-friendly rationale.
- Case summarizer agent
  - Converts raw signals into a concise investigation brief for analysts or operations teams.
  - Keeps the output structured so it can be stored in a case management system.
- Orchestrator
  - CrewAI's `Crew` coordinates tasks across agents.
  - A `Process.sequential` flow is enough for most first versions because you want a deterministic review order.
- Evidence store
  - Persists inputs, model outputs, timestamps, and decision traces (a minimal record sketch follows this list).
  - Required for auditability and post-incident review.
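Before wiring up agents, it helps to pin down what the evidence store actually persists. A minimal sketch of one decision-trace record; the field names are illustrative, not a mandated schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EvidenceRecord:
    """One persisted decision trace for audit and post-incident review."""
    case_id: str
    input_payload: dict        # the normalized transaction as the agent saw it
    task_outputs: list[str]    # intermediate output from each agent stage
    final_decision: str        # e.g. "escalated", "monitored"
    model_version: str         # pin the model/prompt version that produced this
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```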
Implementation
1) Install CrewAI and define your task inputs
Use structured input. Fraud detection breaks quickly when you feed the model free-form blobs with no schema.
```bash
pip install crewai crewai-tools pydantic
```
Create a transaction payload that includes the fields compliance teams actually care about:
```python
from pydantic import BaseModel
from typing import Optional


class TransactionInput(BaseModel):
    transaction_id: str
    customer_id: str
    amount: float
    currency: str
    country: str
    counterparty_country: str
    channel: str
    timestamp: str
    kyc_risk_rating: str
    prior_alert_count: int
    notes: Optional[str] = None
```
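If upstream systems hand you raw dicts, validate at the boundary so malformed payloads fail loudly before they reach an agent. A minimal sketch using the Pydantic v2 API:

```python
from pydantic import ValidationError

raw = {
    "transaction_id": "TXN-1",
    "customer_id": "C-1",
    "amount": "985000",  # arrives as a string; Pydantic coerces it to float
    "currency": "USD",
    "country": "GB",
    "counterparty_country": "AE",
    "channel": "wire",
    "timestamp": "2026-04-21T10:14:00Z",
    "kyc_risk_rating": "high",
    "prior_alert_count": 4,
}

try:
    tx = TransactionInput.model_validate(raw)
except ValidationError as exc:
    # Reject and log; never let a half-formed record reach the crew.
    print(f"Rejected payload: {exc}")
```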
2) Define agents with explicit responsibilities
Keep each agent narrow. In regulated environments, one giant “smart” agent is harder to audit and harder to constrain.
```python
from crewai import Agent

fraud_analyst = Agent(
    role="Fraud Detection Analyst",
    goal="Detect suspicious investment banking transactions using provided signals",
    backstory=(
        "You are an AML/fraud analyst reviewing wire transfers and client activity "
        "for unusual behavior, sanctions risk, and policy violations."
    ),
    verbose=True,
)

compliance_reviewer = Agent(
    role="Compliance Reviewer",
    goal="Validate whether the suspected fraud aligns with AML/KYC escalation policy",
    backstory=(
        "You review cases for regulatory relevance, auditability, and escalation quality."
    ),
    verbose=True,
)

case_summarizer = Agent(
    role="Case Summarizer",
    goal="Produce a concise investigation summary with recommended next action",
    backstory=(
        "You write crisp case notes for investigators and operations teams."
    ),
    verbose=True,
)
```
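CrewAI will otherwise pick up a default model from your environment. In a regulated deployment you usually want to pin the model per agent; a sketch, assuming a recent CrewAI release that exposes the `LLM` class and an OpenAI-compatible internal gateway (both the URL and model name below are placeholders):

```python
from crewai import LLM  # available in recent CrewAI releases

# Hypothetical gateway URL and model name: substitute whatever your bank's
# approved inference endpoint actually serves.
bank_llm = LLM(
    model="gpt-4o",
    base_url="https://llm-gateway.internal.example/v1",
    temperature=0.1,  # keep risk assessments as repeatable as possible
)

# Pass it when constructing each agent, e.g. Agent(..., llm=bank_llm)
```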
3) Create tasks and run them in sequence
This is the core pattern. The first task detects risk; the second validates it against policy; the third produces an investigator-ready brief.
```python
from crewai import Task, Crew, Process


def build_crew(transaction: TransactionInput) -> Crew:
    detect_task = Task(
        description=(
            "Review this transaction for fraud/AML indicators:\n"
            f"{transaction.model_dump()}\n\n"
            "Return a concise risk assessment with indicators such as structuring, "
            "unusual geography, velocity anomalies, counterparty mismatch, or sanctions exposure."
        ),
        expected_output="A short fraud risk assessment with a risk level and reasons.",
        agent=fraud_analyst,
        output_file=f"case_{transaction.transaction_id}_detect.txt",
    )

    compliance_task = Task(
        description=(
            "Review the fraud assessment for compliance relevance. "
            "State whether this should be escalated to AML/compliance investigation "
            "and mention any policy concerns."
        ),
        expected_output="An escalation recommendation with compliance rationale.",
        agent=compliance_reviewer,
        context=[detect_task],
        output_file=f"case_{transaction.transaction_id}_compliance.txt",
    )

    summarize_task = Task(
        description=(
            "Write a final case summary for an investigator. Include key facts, "
            "risk level, compliance recommendation, and next action."
        ),
        expected_output="A structured case summary ready for human review.",
        agent=case_summarizer,
        context=[detect_task, compliance_task],
        output_file=f"case_{transaction.transaction_id}_summary.txt",
    )

    return Crew(
        agents=[fraud_analyst, compliance_reviewer, case_summarizer],
        tasks=[detect_task, compliance_task, summarize_task],
        process=Process.sequential,
        verbose=True,
    )
```
Run it from your service layer:
```python
if __name__ == "__main__":
    tx = TransactionInput(
        transaction_id="TXN-100045",
        customer_id="CUST-7781",
        amount=985000.00,
        currency="USD",
        country="GB",
        counterparty_country="AE",
        channel="wire",
        timestamp="2026-04-21T10:14:00Z",
        kyc_risk_rating="high",
        prior_alert_count=4,
        notes="Multiple wires below reporting threshold over last 72 hours.",
    )
    crew = build_crew(tx)
    result = crew.kickoff()
    print(result)
```
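For the audit trail you want each stage's output, not just the final summary. Recent CrewAI versions attach a result object to each task after kickoff; a sketch, assuming your installed version exposes it as `task.output` with a `.raw` string:

```python
# Collect every stage's output for the evidence store, not just the summary.
for task in crew.tasks:
    if task.output is not None:
        record = {
            "task": task.description[:60],   # truncated label for display
            "raw_output": task.output.raw,
        }
        # persist_to_evidence_store(record)  # hypothetical helper
        print(record)
```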
4) Add guardrails around output handling
Do not let the model directly trigger blocks or account freezes. In investment banking you want human approval on high-impact decisions.
```python
def should_escalate(result_text: str) -> bool:
    text = result_text.lower()
    return any(
        keyword in text
        for keyword in ["high risk", "escalate", "suspicious", "sanctions", "aml"]
    )


# Example usage after kickoff:
# if should_escalate(str(result)):
#     send_to_case_management_system(...)
# else:
#     log_for_monitoring(...)
```
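Keyword matching on free text is brittle; the check above would trip on a summary that says "no sanctions exposure". Pair it with deterministic rules over the structured input that escalate regardless of what the model says. A sketch where the jurisdiction list and thresholds are placeholders, not policy:

```python
HIGH_RISK_JURISDICTIONS = {"IR", "KP", "SY"}  # placeholder list, not real policy
PRIOR_ALERT_LIMIT = 3                          # placeholder threshold


def deterministic_escalation(tx: TransactionInput) -> bool:
    """Hard rules that escalate no matter what the LLM output says."""
    if tx.counterparty_country in HIGH_RISK_JURISDICTIONS:
        return True
    if tx.kyc_risk_rating == "high" and tx.prior_alert_count >= PRIOR_ALERT_LIMIT:
        return True
    return False


# Combine: the LLM can add escalations but never suppress a rule-based one.
# escalate = deterministic_escalation(tx) or should_escalate(str(result))
```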
Production Considerations
- Audit logging
  - Persist every input payload, task prompt, model response, timestamp, version hash, and human override.
  - Regulators will ask why the system escalated or ignored a case.
- Data residency
  - Keep client data in-region if your bank has jurisdictional constraints.
  - If you operate across UK/EU/US/APAC desks, route data to region-specific deployments.
- Monitoring
  - Track false positive rate, analyst override rate, average time-to-triage, and alert volume by segment.
  - A spike in alerts after a model change usually means your prompts drifted or your thresholds are too loose.
- Guardrails
  - Mask PII before sending data to the LLM when possible (see the masking sketch after this list).
  - Block unsupported actions like "close account" or "freeze funds" unless routed through approved workflow systems with human authorization.
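PII masking can be a deterministic transform applied before the payload reaches any prompt. A minimal sketch that pseudonymizes identifiers with a keyed hash so the same customer maps to the same token across cases (the salt handling is illustrative, not production key management):

```python
import hashlib


def mask_identifier(value: str, salt: str) -> str:
    """Replace a real identifier with a stable pseudonym (keyed hash)."""
    digest = hashlib.sha256((salt + value).encode()).hexdigest()[:12]
    return f"MASKED-{digest}"


def masked_payload(tx: TransactionInput, salt: str) -> dict:
    data = tx.model_dump()
    data["customer_id"] = mask_identifier(tx.customer_id, salt)
    # transaction_id stays intact so outputs can be joined back to the case.
    return data
```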
Common Pitfalls
- Using unstructured prompts without schema
  - Mistake: dumping raw JSON logs into one prompt and hoping the model infers what matters.
  - Fix: define a strict input model with Pydantic fields that reflect AML/fraud signals.
- Letting the agent make final decisions
  - Mistake: auto-blocking accounts based on LLM output alone.
  - Fix: use the agent for triage and explanation; keep enforcement behind deterministic rules plus human approval.
- Ignoring compliance traceability
  - Mistake: storing only the final answer.
  - Fix: store the intermediate output of each `Task`, plus source inputs and timestamps, so internal audit can reconstruct the decision path.
- Deploying one global model endpoint
  - Mistake: sending EU client data through a US-hosted inference path.
  - Fix: separate deployments by region and enforce residency at the API gateway or service mesh layer (a routing sketch follows this list).
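Residency enforcement belongs in deterministic infrastructure code, not in prompts. A sketch of a service-layer router; the region map and endpoint URLs are placeholders for whatever your gateway actually exposes:

```python
# Placeholder endpoints: substitute your real region-specific deployments.
REGION_ENDPOINTS = {
    "EU": "https://llm-eu.internal.example/v1",
    "UK": "https://llm-uk.internal.example/v1",
    "US": "https://llm-us.internal.example/v1",
}

# Illustrative mapping; drive this from your reference-data service in practice.
COUNTRY_TO_REGION = {"DE": "EU", "FR": "EU", "GB": "UK", "US": "US"}


def endpoint_for(tx: TransactionInput) -> str:
    region = COUNTRY_TO_REGION.get(tx.country)
    if region is None:
        # Fail closed: unknown residency should block, not default to one region.
        raise ValueError(f"No residency mapping for country {tx.country}")
    return REGION_ENDPOINTS[region]
```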
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit