How to Build a Fraud Detection Agent Using CrewAI in Python for Fintech

By Cyprian Aarons. Updated 2026-04-21.

Tags: fraud-detection, crewai, python, fintech

A fraud detection agent in fintech takes transaction data, customer context, and policy rules, then triages suspicious activity into clear decisions: approve, review, or escalate. The value is not just flagging fraud faster; it is reducing false positives, preserving customer experience, and giving compliance teams an auditable trail for every decision.

Architecture

• Transaction intake layer
  - Pulls events from your payment gateway, card processor, or core banking stream.
  - Normalizes fields like amount, merchant category, IP geolocation, device fingerprint, and account age.
• Risk enrichment layer
  - Adds context from internal systems: KYC status, previous chargebacks, velocity limits, device history, and customer segment.
  - This is where most fraud signal quality comes from.
• CrewAI agent layer
  - A Fraud Analyst Agent evaluates the case against rules and patterns.
  - A Compliance Reviewer Agent checks policy constraints and escalation thresholds.
  - A Decision Synthesizer Agent turns findings into a structured recommendation. The implementation below keeps to two agents and folds this step into the compliance review task.
• Tooling layer
  - Tools for fetching transaction history, querying customer profiles, and writing audit logs.
  - Keep tools narrow and deterministic. Don’t let the agent “invent” data.
• Policy and audit layer
  - Enforces compliance rules like explainability, retention, and region-specific data handling.
  - Stores why a transaction was flagged, which signals were used, and who approved the final action.
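The intake layer's normalization step can be sketched in plain Python. This is a minimal illustration, not a real gateway integration: the raw payload shape (id, amount_minor, merchant.country, and so on) and the NormalizedEvent class are hypothetical, and the assumption that amounts arrive in minor units is labeled in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedEvent:
    transaction_id: str
    amount: float          # major currency units, e.g. 18.90
    currency: str          # upper-case currency code
    merchant_country: str  # upper-case two-letter country code
    ip_country: str
    account_age_days: int


def normalize_event(raw: dict) -> NormalizedEvent:
    """Map a hypothetical gateway payload onto one canonical schema."""
    return NormalizedEvent(
        transaction_id=str(raw["id"]),
        # Assumption: this gateway reports amounts in minor units (cents).
        amount=raw["amount_minor"] / 100,
        currency=raw["currency"].strip().upper(),
        merchant_country=raw["merchant"]["country"].strip().upper(),
        ip_country=raw["client"]["ip_country"].strip().upper(),
        account_age_days=int(raw["account"]["age_days"]),
    )


# Hypothetical raw event with inconsistent casing, whitespace, and types.
raw_event = {
    "id": "tx_12345",
    "amount_minor": 189050,
    "currency": "eur ",
    "merchant": {"country": "ro"},
    "client": {"ip_country": "ng"},
    "account": {"age_days": "4"},
}
event = normalize_event(raw_event)
print(event)
```

Normalizing once at intake means every later layer, including the agents, sees the same field names and units regardless of which upstream system produced the event.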

Implementation

1) Install CrewAI and define the core tools

Install the package with pip install crewai, then build on CrewAI’s Agent, Task, Crew, Process, and BaseTool classes. For fintech work, keep tools explicit so every lookup can be logged and reviewed later.

from crewai import Agent, Task, Crew, Process
from crewai.tools import BaseTool
from pydantic import BaseModel

class TransactionInput(BaseModel):
    transaction_id: str
    amount: float
    currency: str
    merchant_country: str
    ip_country: str
    account_age_days: int
    velocity_1h: int
    chargeback_count_90d: int

class RiskLookupTool(BaseTool):
    name: str = "risk_lookup"
    description: str = "Fetches deterministic risk context for a transaction."

    def _run(self, transaction_id: str) -> str:
        # Replace with real DB/API calls
        return f"transaction_id={transaction_id}; kyc=verified; device=known; residency=EU"

class AuditLogTool(BaseTool):
    name: str = "audit_log"
    description: str = "Writes a structured audit record."

    def _run(self, record_json: str) -> str:
        # Replace with real immutable audit storage
        return f"logged={record_json[:120]}"
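One way to make every lookup loggable is to wrap the function a tool delegates to. The sketch below uses only the standard library; the audited decorator, the AUDIT_TRAIL list, and fetch_risk_context are hypothetical names for illustration, and the in-memory list stands in for real append-only storage.

```python
import functools
import json
from datetime import datetime, timezone

AUDIT_TRAIL: list[dict] = []  # stand-in for append-only audit storage


def audited(tool_name: str):
    """Record each call's arguments and result before returning the result."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            AUDIT_TRAIL.append({
                "tool": tool_name,
                "args": list(args),
                "kwargs": kwargs,
                "result": result,
                "called_at": datetime.now(timezone.utc).isoformat(),
            })
            return result
        return wrapper
    return decorator


@audited("risk_lookup")
def fetch_risk_context(transaction_id: str) -> str:
    # Hypothetical lookup; replace with a real DB/API call.
    context = {"transaction_id": transaction_id, "kyc": "verified", "device": "known"}
    # Sorted keys keep the serialized output stable across calls.
    return json.dumps(context, sort_keys=True)


payload = fetch_risk_context(transaction_id="tx_12345")
print(payload)
print(len(AUDIT_TRAIL), "lookup(s) recorded")
```

A tool's _run method could call a function like fetch_risk_context, so the audit entry is written whether or not the agent mentions the lookup in its answer.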

2) Create agents with narrow responsibilities

Don’t build one giant “fraud agent.” Split analysis from compliance review. That makes outputs easier to test and gives you better traceability when regulators ask how a decision was made.

fraud_analyst = Agent(
    role="Fraud Analyst",
    goal="Assess whether a transaction shows fraud indicators using available evidence.",
    backstory="You evaluate card-not-present fraud patterns and produce concise risk findings.",
    tools=[RiskLookupTool()],
    verbose=True,
)

compliance_reviewer = Agent(
    role="Compliance Reviewer",
    goal="Check whether the proposed action follows fintech policy and regulatory constraints.",
    backstory="You ensure decisions respect KYC/AML policy, audit requirements, and data residency rules.",
    verbose=True,
)

3) Define tasks that force structured outputs

The key pattern is to make each task produce a bounded artifact. For fraud workflows that means a risk summary first, then a policy decision second. Pass the output of one task to the next through the task’s context argument.

def build_fraud_crew(transaction: TransactionInput):
    analyze_task = Task(
        description=(
            "Analyze this transaction for fraud risk:\n"
            f"{transaction.model_dump_json()}\n\n"
            "Use the risk_lookup tool if needed. Return:\n"
            "- top risk signals\n"
            "- estimated risk level: low/medium/high\n"
            "- recommended action"
        ),
        expected_output="A concise fraud assessment with risk level and action.",
        agent=fraud_analyst,
    )

    review_task = Task(
        description=(
            "Review the fraud assessment for compliance constraints.\n"
            "Confirm whether the recommendation can be executed under policy.\n"
            "Return final decision with an audit-friendly explanation."
        ),
        expected_output="Final compliant decision with rationale.",
        agent=compliance_reviewer,
        context=[analyze_task],
        tools=[AuditLogTool()],
    )

    return Crew(
        agents=[fraud_analyst, compliance_reviewer],
        tasks=[analyze_task, review_task],
        process=Process.sequential,
        verbose=True,
    )
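The tasks above return free text, so a bounded artifact still has to be enforced somewhere. A minimal sketch, assuming you ask the analyst task to answer in JSON with risk_level, action, and signals keys: parse the text and reject anything outside the allowed values. The FraudAssessment class and parse_assessment function are hypothetical, and the action names are the three outcomes from the introduction. Recent CrewAI releases also expose an output_pydantic argument on Task for this purpose; check it against your installed version.

```python
import json
from dataclasses import dataclass

ALLOWED_RISK_LEVELS = {"low", "medium", "high"}
ALLOWED_ACTIONS = {"approve", "review", "escalate"}


@dataclass(frozen=True)
class FraudAssessment:
    risk_level: str
    action: str
    signals: tuple[str, ...]


def parse_assessment(raw_output: str) -> FraudAssessment:
    """Parse model output as JSON and reject values outside the schema."""
    data = json.loads(raw_output)  # raises ValueError on non-JSON text
    risk_level = str(data["risk_level"]).lower()
    action = str(data["action"]).lower()
    signals = tuple(str(s) for s in data["signals"])
    if risk_level not in ALLOWED_RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {risk_level!r}")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected action: {action!r}")
    if not signals:
        raise ValueError("assessment must cite at least one signal")
    return FraudAssessment(risk_level, action, signals)


# Hypothetical model output used only to exercise the parser.
sample = '{"risk_level": "High", "action": "escalate", "signals": ["ip/merchant country mismatch"]}'
assessment = parse_assessment(sample)
print(assessment)
```

Treat a parsing failure as a reason to route the case to manual review rather than retrying until the model happens to produce something valid.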

4) Run the crew and persist the decision

The example below only prints the crew output. In production you want a JSON result you can store in your case management system, and it should map cleanly to downstream actions like auto-blocking a card or sending a case to analysts.

if __name__ == "__main__":
    tx = TransactionInput(
        transaction_id="tx_12345",
        amount=1890.50,
        currency="EUR",
        merchant_country="RO",
        ip_country="NG",
        account_age_days=4,
        velocity_1h=12,
        chargeback_count_90d=2,
    )

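    crew = build_fraud_crew(tx)
    # kickoff() runs the tasks in sequence and returns the crew's final output.
    # It calls an LLM, so credentials for your provider must be configured first.
    result = crew.kickoff()

    print(result)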
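Persisting the decision can be sketched as building one JSON-serializable audit record per run. This is an illustration under assumptions: build_audit_record and its field names are hypothetical, and the version strings are placeholders for whatever identifiers your deployment uses.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(tx: dict, decision_text: str,
                       model_version: str, policy_version: str) -> dict:
    """Assemble a JSON-serializable record for one decision."""
    # Canonical serialization so the same input always yields the same hash.
    canonical_input = json.dumps(tx, sort_keys=True, separators=(",", ":"))
    return {
        "transaction_id": tx["transaction_id"],
        "input_sha256": hashlib.sha256(canonical_input.encode("utf-8")).hexdigest(),
        "decision_text": decision_text,
        "model_version": model_version,
        "policy_version": policy_version,
        "decided_at": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical values; in the script above these would come from tx and result.
example_tx = {"transaction_id": "tx_12345", "amount": 1890.50, "currency": "EUR"}
record = build_audit_record(
    example_tx,
    decision_text="review: new account with high velocity",
    model_version="model-v1",        # placeholder identifier
    policy_version="policy-2026-04", # placeholder identifier
)
print(json.dumps(record, indent=2))
```

In the script above, tx.model_dump() would supply the input dictionary and str(result) the decision text, which is the same text print(result) shows.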

Production Considerations

• Keep sensitive data in-region
  - If your customers are in the EU or UK, don’t send raw PII outside approved regions.
  - Redact names, account numbers, and full addresses before passing context to the model.
• Log every decision path
  - Store input features, tool calls, agent outputs, final decision, timestamp, model version, and policy version.
  - Fraud teams need replayable decisions for disputes and regulator reviews.
• Put hard guardrails around actions
  - Let the agent recommend block/review/allow.
  - Keep final execution behind deterministic rules or human approval for high-value transactions.
• Monitor false positives aggressively
  - Track precision by segment: new users vs existing users, domestic vs cross-border, high-value vs low-value.
  - A fraud model that blocks good customers creates direct revenue loss.
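The guardrail on actions can be a small deterministic function that sits between the agent's recommendation and execution. The sketch below is one possible rule set, not a policy recommendation: gate_action is a hypothetical name and the threshold value is an arbitrary placeholder.

```python
ALLOWED_RECOMMENDATIONS = {"allow", "review", "block"}
HIGH_VALUE_THRESHOLD = 1000.00  # placeholder limit, same currency as amount


def gate_action(recommended: str, amount: float) -> str:
    """Return the action the system may execute without a human."""
    if recommended not in ALLOWED_RECOMMENDATIONS:
        return "review"  # unknown recommendation goes to a human
    if amount >= HIGH_VALUE_THRESHOLD:
        return "review"  # high-value transactions always need approval
    return recommended


print(gate_action("block", 1890.50))  # high value, so routed to review
print(gate_action("allow", 25.00))    # low value, recommendation stands
```

Because the function has no model in the loop, it can be unit tested and versioned alongside the policy it encodes.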

Common Pitfalls

• Using one agent for everything
  - This mixes investigation logic with compliance logic.
  - Split responsibilities so each task has one job and one output format.
• Letting the agent query uncontrolled data sources
  - If it can browse arbitrary systems or infer missing facts, your audit trail breaks.
  - Restrict tools to approved APIs with deterministic responses.
• Ignoring explainability requirements
  - “High risk because it feels suspicious” is useless in fintech.
  - Always capture concrete signals like velocity spikes, country mismatch, device novelty, or chargeback history.
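Concrete signals are easiest to defend when they are computed outside the model and handed to it as evidence. A minimal sketch using the fields from the TransactionInput example: extract_signals and both thresholds are hypothetical, and device novelty is left out because that schema carries no device field.

```python
VELOCITY_LIMIT_1H = 10  # placeholder threshold: transactions per hour
NEW_ACCOUNT_DAYS = 30   # placeholder threshold: account age in days


def extract_signals(tx: dict) -> list[str]:
    """Return human-readable signals, each stating the value that fired it."""
    signals = []
    if tx["ip_country"] != tx["merchant_country"]:
        signals.append(
            f"country_mismatch: ip={tx['ip_country']} merchant={tx['merchant_country']}"
        )
    if tx["velocity_1h"] >= VELOCITY_LIMIT_1H:
        signals.append(f"velocity_spike: {tx['velocity_1h']} transactions in 1h")
    if tx["account_age_days"] < NEW_ACCOUNT_DAYS:
        signals.append(f"new_account: {tx['account_age_days']} days old")
    if tx["chargeback_count_90d"] > 0:
        signals.append(f"chargeback_history: {tx['chargeback_count_90d']} in 90d")
    return signals


# Same values as the example transaction in step 4.
suspicious_tx = {
    "merchant_country": "RO",
    "ip_country": "NG",
    "account_age_days": 4,
    "velocity_1h": 12,
    "chargeback_count_90d": 2,
}
for signal in extract_signals(suspicious_tx):
    print(signal)
```

Storing this list next to the final decision gives reviewers the exact values behind each flag instead of a model's paraphrase.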

If you build this pattern correctly, CrewAI becomes a workflow engine for fraud triage rather than an unpredictable chatbot. That is what you want in fintech: fast decisions backed by evidence, policy checks baked in, and enough audit detail to survive production incidents and compliance reviews.

By Cyprian Aarons, AI Consultant at Topiax.
