How to Build a Fraud Detection Agent Using CrewAI in Python for Payments
A fraud detection agent for payments watches transaction context, scores suspicious behavior, and routes high-risk cases for review or blocking. It matters because payment fraud is a latency-sensitive problem: you need fast decisions, consistent audit trails, and a clear reason for every action taken on a card, wallet, or bank transfer.
Architecture
- Transaction intake service
  - Receives payment events from your gateway, PSP, or internal ledger.
  - Normalizes fields like amount, currency, merchant category, device fingerprint, IP, and customer history.
- Risk analysis agent
  - Uses a CrewAI Agent to inspect the transaction and call tools for enrichment.
  - Produces a structured fraud assessment with risk score and rationale.
- Enrichment tools
  - Pulls velocity checks, geo/IP reputation, chargeback history, account age, and prior disputes.
  - Keeps the agent grounded in actual payment signals instead of free-form reasoning.
- Decision policy layer
  - Converts the agent output into actions: approve, step-up auth, hold for review, or decline.
  - Enforces hard rules that should never be overridden by an LLM.
- Audit and case store
  - Persists inputs, outputs, tool calls, timestamps, and final decisions.
  - Required for PCI-adjacent controls, internal investigations, and regulator reviews.
- Monitoring and feedback loop
  - Tracks false positives, false negatives, manual review outcomes, and drift in transaction patterns.
  - Feeds labeled outcomes back into your rules and prompts.
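To make the intake step concrete, here is a minimal sketch of normalizing a raw gateway event into one internal shape. The payload keys (id, amount_minor, mcc, ip) and the NormalizedTransaction type are hypothetical, and the code assumes amounts arrive in minor units with two decimal places, which does not hold for every currency.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class NormalizedTransaction:
    transaction_id: str
    amount: Decimal          # major units, e.g. 499.99
    currency: str            # upper-case ISO currency code
    merchant_category: str   # MCC as a 4-digit string
    ip_address: str


def normalize_event(raw: dict) -> NormalizedTransaction:
    """Map a hypothetical gateway payload onto one internal shape."""
    # Assumption: the gateway reports amounts in minor units (cents).
    amount = Decimal(int(raw["amount_minor"])) / Decimal(100)
    return NormalizedTransaction(
        transaction_id=str(raw["id"]),
        amount=amount,
        currency=str(raw["currency"]).strip().upper(),
        merchant_category=str(raw["mcc"]).zfill(4),
        ip_address=str(raw["ip"]).strip(),
    )


event = {"id": "txn_123", "amount_minor": 49999, "currency": "usd",
         "mcc": 742, "ip": " 102.88.12.44 "}
txn = normalize_event(event)
```

Using Decimal instead of float avoids rounding surprises when amounts are compared against thresholds later in the pipeline.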
Implementation
1) Install CrewAI and define the risk tools
Install CrewAI with pip install crewai. Use tools for deterministic checks: the agent should reason over evidence, not invent it.
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from typing import Dict


class TransactionInput(BaseModel):
    transaction_id: str = Field(..., description="Payment transaction identifier")
    amount: float = Field(..., description="Transaction amount")
    currency: str = Field(..., description="ISO currency code")
    customer_id: str = Field(..., description="Customer identifier")
    merchant_id: str = Field(..., description="Merchant identifier")
    ip_address: str = Field(..., description="Customer IP address")


class VelocityCheckTool(BaseTool):
    name: str = "velocity_check"
    description: str = "Checks recent transaction velocity for a customer"

    def _run(self, customer_id: str) -> Dict:
        # Replace with Redis/Postgres/feature store lookup
        recent_txn_count = 7
        return {"customer_id": customer_id, "last_10m_count": recent_txn_count}


class GeoRiskTool(BaseTool):
    name: str = "geo_risk_lookup"
    description: str = "Returns IP-based geo risk signal"

    def _run(self, ip_address: str) -> Dict:
        # Replace with MaxMind / internal risk service
        return {"ip_address": ip_address, "country": "NG", "risk_flag": True}
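The velocity tool above returns a hard-coded count. As a rough idea of what could sit behind it, here is a sketch of a sliding-window counter kept in process memory. InMemoryVelocityCounter is a hypothetical stand-in for the Redis or feature-store lookup, and it assumes timestamps are recorded in increasing order.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class InMemoryVelocityCounter:
    """Counts transactions per customer inside a sliding time window."""

    def __init__(self, window_seconds: int = 600) -> None:
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # customer_id -> timestamps

    def record(self, customer_id: str, ts: Optional[float] = None) -> None:
        # Assumption: events arrive in time order for each customer.
        self._events[customer_id].append(time.time() if ts is None else ts)

    def count(self, customer_id: str, now: Optional[float] = None) -> int:
        now = time.time() if now is None else now
        window = self._events[customer_id]
        # Drop timestamps that have aged out of the window.
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        return len(window)


counter = InMemoryVelocityCounter(window_seconds=600)
for ts in (0.0, 100.0, 650.0):
    counter.record("cus_456", ts=ts)
recent = counter.count("cus_456", now=700.0)
```

A tool's _run method could call count() and return the result as last_10m_count. A single-process counter like this does not survive restarts or scale across workers, which is why the original comment points at a shared store.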
2) Create the fraud analyst agent
Keep the prompt narrow. In payments systems you want consistent outputs that downstream code can parse.
from crewai import Agent

fraud_agent = Agent(
    role="Payments Fraud Analyst",
    goal="Assess payment transactions for fraud risk using provided evidence only",
    backstory=(
        "You review payment events for fraud indicators such as velocity spikes,"
        " geo mismatch, unusual amount patterns, and merchant abuse."
        " You must return a concise risk assessment."
    ),
    tools=[VelocityCheckTool(), GeoRiskTool()],
    verbose=True,
)
3) Define a task that forces structured output
CrewAI tasks can carry explicit instructions. In production, ask for strict JSON with a fixed set of fields so your policy engine can parse it reliably.
from crewai import Task

fraud_task = Task(
    description=(
        "Analyze this payment transaction for fraud risk.\n"
        "Return:\n"
        "- risk_level: low|medium|high\n"
        "- score: integer 0-100\n"
        "- reasons: list of short strings\n"
        "- action: approve|step_up|review|decline\n\n"
        "Transaction:\n{transaction}"
    ),
    expected_output="A structured fraud decision with reasons and action.",
    agent=fraud_agent,
)
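If you tighten the task prompt so the agent replies with a single JSON object, the reply can be validated before anything downstream trusts it. The sketch below uses only the standard library; FraudAssessment and parse_assessment are hypothetical names, and the field names mirror the task description above. Recent CrewAI versions also accept an output_pydantic argument on Task for a similar purpose, so check what your installed version supports.

```python
import json
from dataclasses import dataclass
from typing import List

RISK_LEVELS = {"low", "medium", "high"}
ACTIONS = {"approve", "step_up", "review", "decline"}


@dataclass(frozen=True)
class FraudAssessment:
    risk_level: str
    score: int
    reasons: List[str]
    action: str


def parse_assessment(raw_text: str) -> FraudAssessment:
    """Parse and validate the agent's reply; raise ValueError if malformed."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    risk_level, score = data.get("risk_level"), data.get("score")
    reasons, action = data.get("reasons"), data.get("action")

    if not isinstance(risk_level, str) or risk_level not in RISK_LEVELS:
        raise ValueError(f"invalid risk_level: {risk_level!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError(f"invalid score: {score!r}")
    if not isinstance(reasons, list) or not all(isinstance(r, str) for r in reasons):
        raise ValueError("reasons must be a list of strings")
    if not isinstance(action, str) or action not in ACTIONS:
        raise ValueError(f"invalid action: {action!r}")
    return FraudAssessment(risk_level, score, reasons, action)


reply = '{"risk_level": "high", "score": 87, "reasons": ["velocity spike"], "action": "review"}'
assessment = parse_assessment(reply)
```

Treating any ValueError as "route to manual review" keeps a malformed model reply from silently becoming an approval.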
4) Run the crew and apply a hard policy gate
The LLM proposes; your policy decides. That separation matters when you need deterministic behavior under compliance review.
from crewai import Crew


def decide_action(result_text: str) -> dict:
    # In production parse strict JSON from the model output.
    # Keep this example simple but explicit: check the most restrictive
    # action first so ambiguous text fails safe.
    text = result_text.lower()
    if "decline" in text:
        return {"final_action": "decline"}
    if "review" in text:
        return {"final_action": "review"}
    if "step_up" in text:
        return {"final_action": "step_up"}
    return {"final_action": "approve"}


transaction = TransactionInput(
    transaction_id="txn_123",
    amount=499.99,
    currency="USD",
    customer_id="cus_456",
    merchant_id="m_789",
    ip_address="102.88.12.44",
)

crew = Crew(
    agents=[fraud_agent],
    tasks=[fraud_task],
)

result = crew.kickoff(inputs={"transaction": transaction.model_dump_json()})
decision = decide_action(str(result))
print({"agent_result": str(result), **decision})
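The keyword matching above is deliberately simple. One way to make the gate stricter, sketched here under assumptions, is to let the agent only tighten a decision and never loosen it. apply_policy, the hard_block flag, and the 50/80 score thresholds are all hypothetical placeholders, not values from CrewAI or from any real policy.

```python
from typing import Optional

# Ordered from least to most restrictive.
SEVERITY = ["approve", "step_up", "review", "decline"]


def stricter(a: str, b: str) -> str:
    """Return whichever of two actions is more restrictive."""
    return a if SEVERITY.index(a) >= SEVERITY.index(b) else b


def apply_policy(proposed_action: Optional[str],
                 score: Optional[int],
                 hard_block: bool) -> dict:
    # Hard rules win regardless of what the model proposed.
    if hard_block:
        return {"final_action": "decline", "source": "hard_rule"}
    # Missing or unknown output never results in an approval.
    if proposed_action not in SEVERITY or score is None:
        return {"final_action": "review", "source": "fallback"}
    # Illustrative thresholds: the score sets a minimum severity.
    floor = "review" if score >= 80 else "step_up" if score >= 50 else "approve"
    return {"final_action": stricter(proposed_action, floor),
            "source": "agent_with_floor"}


decision = apply_policy(proposed_action="approve", score=90, hard_block=False)
```

Here the agent proposed approve with a score of 90, so the floor raises the final action to review. The same function falls back to review when the model output could not be parsed.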
Production Considerations
- Put hard limits outside the model
  - Block sanctioned geographies, impossible amounts, BIN-country mismatches, or known stolen cards before the agent runs.
  - The model should not be your first line of defense.
- Log every decision path
  - Store raw inputs, tool outputs, prompt version, model version, and the final action.
  - For payments audits you need traceability from alert to outcome.
- Respect data residency and PCI boundaries
  - Do not send PANs or sensitive authentication data to the model.
  - Tokenize card data and keep regional processing aligned with residency requirements.
- Monitor precision by payment segment
  - Separate metrics by card-not-present vs card-present, geography, merchant category code (MCC), and channel.
  - A single global false-positive rate hides real losses in specific segments.
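For the logging point above, an audit record can be as simple as one JSON-serializable dict per decision. The sketch below is one possible shape, not a compliance-approved format: build_audit_record and its field names are hypothetical, and the content hash is only a convenience for detecting later edits to a stored record.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(transaction: dict, tool_outputs: dict, agent_output: str,
                       final_action: str, prompt_version: str,
                       model_version: str) -> dict:
    payload = {
        "transaction": transaction,
        "tool_outputs": tool_outputs,
        "agent_output": agent_output,
        "final_action": final_action,
        "prompt_version": prompt_version,
        "model_version": model_version,
    }
    # Canonical JSON (sorted keys) so identical content hashes identically.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return {
        **payload,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "content_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


record = build_audit_record(
    transaction={"transaction_id": "txn_123", "customer_id": "cus_456"},
    tool_outputs={"velocity_check": {"last_10m_count": 7}},
    agent_output="risk_level: high",
    final_action="review",
    prompt_version="fraud-task-v1",   # hypothetical version labels
    model_version="example-model",
)
```

The timestamp is left out of the hash on purpose, so re-recording the same decision yields the same digest. Only tokenized identifiers should go into the transaction field.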
Common Pitfalls
- Letting the agent make final authorization decisions
  - Avoid this by keeping approval/decline logic in a deterministic policy layer.
  - Use the agent for assessment; use code for enforcement.
- Passing raw sensitive payment data into prompts
  - Never include full PANs, CVVs, or secrets.
  - Tokenize identifiers and pass only the minimum fields needed for analysis.
- No feedback loop from chargebacks and manual reviews
  - Without labeled outcomes, thresholds and prompts drift out of calibration quickly.
  - Feed confirmed fraud and false positives back into your ruleset and tool signals so thresholds stay calibrated.
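As a last-resort guard against the second pitfall, free-text fields can be scanned for card numbers before they reach a prompt. This is a rough sketch, assuming PANs appear as 13 to 19 digits with optional spaces or hyphens; redact_pans is a hypothetical helper, it does not detect CVVs, and it is no substitute for tokenizing at intake.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_pans(text: str) -> str:
    """Mask digit runs that pass the Luhn check, keeping the last four."""
    def _mask(match):
        digits = re.sub(r"[ -]", "", match.group())
        if not luhn_valid(digits):
            return match.group()  # leave order numbers and the like alone
        return f"[PAN ****{digits[-4:]}]"

    return PAN_CANDIDATE.sub(_mask, text)


# 4111 1111 1111 1111 is a widely used test card number, not a real card.
note = redact_pans("customer typed card 4111 1111 1111 1111 into the memo")
```

The Luhn check reduces false matches on other long digit strings, though some non-card numbers will still pass it by chance.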
By Cyprian Aarons, AI Consultant at Topiax.