How to Build a Fraud Detection Agent Using CrewAI in Python for Payments

By Cyprian Aarons · Updated 2026-04-21
Tags: fraud-detection, crewai, python, payments

A fraud detection agent for payments watches transaction context, scores suspicious behavior, and routes high-risk cases for review or blocking. It matters because payment fraud is a latency-sensitive problem: you need fast decisions, consistent audit trails, and a clear reason for every action taken on a card, wallet, or bank transfer.

Architecture

  • Transaction intake service

    • Receives payment events from your gateway, PSP, or internal ledger.
    • Normalizes fields like amount, currency, merchant category, device fingerprint, IP, and customer history.
  • Risk analysis agent

    • Uses CrewAI Agent to inspect the transaction and call tools for enrichment.
    • Produces a structured fraud assessment with risk score and rationale.
  • Enrichment tools

    • Pulls velocity checks, geo/IP reputation, chargeback history, account age, and prior disputes.
    • Keeps the agent grounded in actual payment signals instead of free-form reasoning.
  • Decision policy layer

    • Converts the agent output into actions: approve, step-up auth, hold for review, or decline.
    • Enforces hard rules that should never be overridden by an LLM.
  • Audit and case store

    • Persists inputs, outputs, tool calls, timestamps, and final decisions.
    • Required for PCI-adjacent controls, internal investigations, and regulator reviews.
  • Monitoring and feedback loop

    • Tracks false positives, false negatives, manual review outcomes, and drift in transaction patterns.
    • Feeds labeled outcomes back into your rules and prompts.
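The decision policy layer described above can be sketched as a thin function in which hard rules run first and cannot be overridden by the agent's score. The thresholds, country set, and function name here are illustrative, not a recommended ruleset:

```python
# Illustrative sketch of a decision policy layer. Hard rules are
# evaluated before, and take precedence over, the agent's risk score.
HARD_BLOCK_COUNTRIES = {"KP", "IR"}  # example sanctioned geographies


def policy_decision(agent_score: int, country: str, amount: float) -> str:
    """Map an agent risk score to an action, with hard rules first."""
    # Hard rules: these never defer to the LLM.
    if country in HARD_BLOCK_COUNTRIES or amount <= 0:
        return "decline"
    # Score-based policy: thresholds are placeholders you would tune.
    if agent_score >= 80:
        return "decline"
    if agent_score >= 60:
        return "review"
    if agent_score >= 40:
        return "step_up"
    return "approve"
```

Keeping this mapping in plain code means the action taken for any given score is reproducible in an audit, regardless of how the model phrased its assessment.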

Implementation

1) Install CrewAI and define the risk tools

Use tools for deterministic checks. The agent should reason over evidence; it should not invent evidence.

from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from typing import Dict


class TransactionInput(BaseModel):
    transaction_id: str = Field(..., description="Payment transaction identifier")
    amount: float = Field(..., description="Transaction amount")
    currency: str = Field(..., description="ISO currency code")
    customer_id: str = Field(..., description="Customer identifier")
    merchant_id: str = Field(..., description="Merchant identifier")
    ip_address: str = Field(..., description="Customer IP address")


class VelocityCheckTool(BaseTool):
    name: str = "velocity_check"
    description: str = "Checks recent transaction velocity for a customer"

    def _run(self, customer_id: str) -> Dict:
        # Replace with Redis/Postgres/feature store lookup
        recent_txn_count = 7
        return {"customer_id": customer_id, "last_10m_count": recent_txn_count}


class GeoRiskTool(BaseTool):
    name: str = "geo_risk_lookup"
    description: str = "Returns IP-based geo risk signal"

    def _run(self, ip_address: str) -> Dict:
        # Replace with MaxMind / internal risk service
        return {"ip_address": ip_address, "country": "NG", "risk_flag": True}

2) Create the fraud analyst agent

Keep the prompt narrow. In payments systems you want consistent outputs that downstream code can parse.

from crewai import Agent

fraud_agent = Agent(
    role="Payments Fraud Analyst",
    goal="Assess payment transactions for fraud risk using provided evidence only",
    backstory=(
        "You review payment events for fraud indicators such as velocity spikes,"
        " geo mismatch, unusual amount patterns, and merchant abuse."
        " You must return a concise risk assessment."
    ),
    tools=[VelocityCheckTool(), GeoRiskTool()],
    verbose=True,
)

3) Define a task that forces structured output

CrewAI tasks can carry explicit instructions. For production, require strict JSON output so your policy engine can parse it safely.

from crewai import Task

fraud_task = Task(
    description=(
        "Analyze this payment transaction for fraud risk.\n"
        "Return:\n"
        "- risk_level: low|medium|high\n"
        "- score: integer 0-100\n"
        "- reasons: list of short strings\n"
        "- action: approve|step_up|review|decline\n\n"
        "Transaction:\n{transaction}"
    ),
    expected_output="A structured fraud decision with reasons and action.",
    agent=fraud_agent,
)

4) Run the crew and apply a hard policy gate

The LLM proposes; your policy decides. That separation matters when you need deterministic behavior under compliance review.

from crewai import Crew
import json


def decide_action(result_text: str) -> dict:
    # In production parse strict JSON from the model output.
    # Keep this example simple but explicit.
    if "decline" in result_text.lower():
        return {"final_action": "decline"}
    if "review" in result_text.lower():
        return {"final_action": "review"}
    if "step_up" in result_text.lower():
        return {"final_action": "step_up"}
    return {"final_action": "approve"}


transaction = TransactionInput(
    transaction_id="txn_123",
    amount=499.99,
    currency="USD",
    customer_id="cus_456",
    merchant_id="m_789",
    ip_address="102.88.12.44",
)

crew = Crew(
    agents=[fraud_agent],
    tasks=[fraud_task],
)

result = crew.kickoff(inputs={"transaction": transaction.model_dump_json()})
decision = decide_action(str(result))

print({"agent_result": str(result), **decision})

Production Considerations

  • Put hard limits outside the model

    • Block sanctioned geographies, impossible amounts, BIN-country mismatches, or known stolen cards before the agent runs.
    • The model should not be your first line of defense.
  • Log every decision path

    • Store raw inputs, tool outputs, prompt version, model version, final action.
    • For payments audits you need traceability from alert to outcome.
  • Respect data residency and PCI boundaries

    • Do not send PANs or sensitive authentication data to the model.
    • Tokenize card data and keep regional processing aligned with residency requirements.
  • Monitor precision by payment segment

    • Separate metrics by card-not-present vs card-present, geography, merchant category code (MCC), and channel.
    • A single global false-positive rate hides real losses in specific segments.
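Computing precision per segment rather than globally is straightforward once review outcomes are labeled. A minimal sketch over (segment, flagged, confirmed_fraud) records, with illustrative segment keys:

```python
from collections import defaultdict


def precision_by_segment(outcomes):
    """Alert precision per segment from labeled review outcomes.

    `outcomes` is an iterable of (segment, was_flagged, was_fraud)
    tuples; segments with no flagged transactions are omitted.
    """
    flagged = defaultdict(int)
    true_pos = defaultdict(int)
    for segment, was_flagged, was_fraud in outcomes:
        if was_flagged:
            flagged[segment] += 1
            if was_fraud:
                true_pos[segment] += 1
    return {seg: true_pos[seg] / flagged[seg] for seg in flagged}
```

Breaking the same calculation out by MCC, geography, and channel will often surface a segment bleeding false positives that the global number hides.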

Common Pitfalls

  • Letting the agent make final authorization decisions

    • Avoid this by keeping approval/decline logic in a deterministic policy layer.
    • Use the agent for assessment; use code for enforcement.
  • Passing raw sensitive payment data into prompts

    • Never include full PANs, CVVs, or secrets.
    • Tokenize identifiers and pass only the minimum fields needed for analysis.
  • No feedback loop from chargebacks and manual reviews

    • Without labels you will drift fast.
    • Feed confirmed fraud and false positives back into your ruleset and tool signals so thresholds stay calibrated.
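Data minimization is easier to enforce in code than by convention. One pattern is an allowlist filter applied to every event before any prompt is built, so a PAN or CVV that leaks into an upstream payload can never reach the model. The field names here are illustrative:

```python
# Only these fields may ever reach the model; everything else is dropped.
PROMPT_ALLOWLIST = {
    "transaction_id",
    "amount",
    "currency",
    "customer_id",
    "merchant_id",
    "ip_address",
}


def minimize_for_prompt(event: dict) -> dict:
    """Drop any field not explicitly allowlisted (PAN, CVV, etc.)."""
    return {k: v for k, v in event.items() if k in PROMPT_ALLOWLIST}
```

An allowlist fails safe: a new sensitive field added upstream is excluded by default, whereas a denylist would pass it through until someone remembered to block it.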

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
