How to Build a Fraud Detection Agent Using LlamaIndex in Python for Wealth Management
A fraud detection agent for wealth management ingests client activity, portfolio events, account changes, and internal policy documents, then flags suspicious patterns and explains why they matter. In this domain, the agent is not just looking for obvious fraud; it has to catch account takeover attempts, unusual withdrawal behavior, advisor impersonation, beneficiary changes, and policy breaches while preserving auditability and compliance.
Architecture
- Data ingestion layer
  - Pulls structured data from transaction logs, CRM events, case management systems, and custodian feeds.
  - Normalizes records into a common schema with timestamps, client IDs, advisor IDs, and risk signals.
- Knowledge index
  - Stores wealth-management policies, escalation playbooks, KYC/AML rules, and historical fraud cases.
  - Built with `VectorStoreIndex` so the agent can retrieve relevant evidence before making a decision.
- Fraud reasoning toolchain
  - Uses LLM-driven analysis plus deterministic checks.
  - Combines retrieval from LlamaIndex with rule-based thresholds for signals like large wire transfers or unusual login geography.
- Decision layer
  - Produces a structured output: risk score, reason codes, supporting evidence, and next action.
  - Routes high-risk cases to human review instead of auto-blocking everything.
- Audit and case logging
  - Persists prompts, retrieved context, model outputs, and final decisions.
  - Needed for compliance review, model governance, and post-incident investigations.
- Security and residency controls
  - Keeps client data in approved regions.
  - Redacts PII where possible and limits what goes into retrieval indexes.
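The ingestion layer's common schema can be sketched as a small normalizer. This is a minimal illustration; the field names and the `normalize_event` helper are assumptions for this article, not a fixed standard:

```python
from datetime import datetime, timezone

def normalize_event(raw: dict, source: str) -> dict:
    """Map a source-specific record onto a common schema (illustrative fields)."""
    return {
        "event_id": raw.get("id"),
        "source": source,  # e.g. "custodian_feed", "crm", "case_mgmt"
        "timestamp": raw.get("ts") or datetime.now(timezone.utc).isoformat(),
        "client_id": raw.get("client_id"),
        "advisor_id": raw.get("advisor_id"),
        "event_type": raw.get("type"),  # e.g. "wire", "login", "beneficiary_change"
        "risk_signals": raw.get("signals", []),
    }

record = normalize_event(
    {"id": "E1", "client_id": "C10291", "advisor_id": "A7", "type": "wire"},
    source="custodian_feed",
)
```

Every downstream layer (rules, retrieval, audit) then works against this one shape instead of per-source formats.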
Implementation
1) Install LlamaIndex and define your knowledge base
For wealth management fraud detection, start with policy documents and prior incident notes. You want the agent to answer questions like: “Does this transfer violate internal controls?” or “What escalation path applies to this pattern?”
```shell
pip install llama-index llama-index-llms-openai llama-index-embeddings-openai
```
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

docs = SimpleDirectoryReader("./wealth_policy_docs").load_data()
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(similarity_top_k=3)
```
This gives you a retriever-backed knowledge layer. In practice, the folder should contain AML policies, wire transfer approval rules, suspicious activity playbooks, and advisor conduct guidelines.
2) Add a fraud analysis tool that combines rules with retrieval
You should not rely on the model alone. Use deterministic checks first, then ask the LLM to explain the risk in context.
```python
from dataclasses import dataclass

@dataclass
class TransactionEvent:
    client_id: str
    amount: float
    country: str
    channel: str
    description: str

def rule_score(event: TransactionEvent) -> int:
    """Deterministic checks that run before any LLM call."""
    score = 0
    if event.amount > 250000:
        score += 40
    if event.country not in {"US", "CA", "GB", "DE", "SG"}:
        score += 25
    if event.channel == "wire":
        score += 15
    return score

def analyze_event(event: TransactionEvent) -> dict:
    prompt = f"""
Client event:
- client_id: {event.client_id}
- amount: {event.amount}
- country: {event.country}
- channel: {event.channel}
- description: {event.description}

Return:
1) fraud risk summary
2) likely policy concerns
3) recommended action for a wealth manager
"""
    response = query_engine.query(prompt)
    return {
        "rule_score": rule_score(event),
        "llm_analysis": str(response),
    }

sample = TransactionEvent(
    client_id="C10291",
    amount=500000,
    country="AE",
    channel="wire",
    description="Urgent transfer requested after phone call from advisor assistant",
)

result = analyze_event(sample)
print(result["rule_score"])
print(result["llm_analysis"])
```
That pattern works because the rules catch obvious anomalies while the retriever grounds the explanation in actual policy text. In regulated environments, that separation matters more than fancy prompting.
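Because the rule checks are additive, you can sanity-check them in isolation before involving the LLM at all. A quick sketch, with `TransactionEvent` and `rule_score` repeated so it runs on its own:

```python
from dataclasses import dataclass

@dataclass
class TransactionEvent:
    client_id: str
    amount: float
    country: str
    channel: str
    description: str

def rule_score(event: TransactionEvent) -> int:
    score = 0
    if event.amount > 250000:
        score += 40
    if event.country not in {"US", "CA", "GB", "DE", "SG"}:
        score += 25
    if event.channel == "wire":
        score += 15
    return score

# A routine domestic distribution triggers nothing; the urgent offshore
# wire trips all three checks.
benign = TransactionEvent("C10292", 12000, "US", "ach", "Monthly distribution")
risky = TransactionEvent("C10291", 500000, "AE", "wire", "Urgent transfer")
print(rule_score(benign))  # 0
print(rule_score(risky))   # 80
```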
3) Expose the agent as a case-review workflow
The output should be usable by operations staff. Keep it structured so downstream systems can route cases automatically.
```python
import json

def build_case_payload(event: TransactionEvent) -> dict:
    analysis = analyze_event(event)

    risk_level = "low"
    if analysis["rule_score"] >= 50:
        risk_level = "high"
    elif analysis["rule_score"] >= 25:
        risk_level = "medium"

    return {
        "client_id": event.client_id,
        "risk_level": risk_level,
        "rule_score": analysis["rule_score"],
        "analysis": analysis["llm_analysis"],
        "recommended_action": (
            "Escalate to compliance" if risk_level == "high"
            else "Review manually" if risk_level == "medium"
            else "No action"
        ),
    }

case_payload = build_case_payload(sample)
print(json.dumps(case_payload, indent=2))
```
This is the shape you want for integration with ServiceNow, Salesforce Financial Services Cloud, or an internal case management system. The agent should never be the final authority on blocking money movement without human review hooks.
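One way to consume that payload downstream is a router plus an explicit human-approval gate. The queue names and the `route_case`/`execute_block` helpers here are hypothetical, not any vendor's API:

```python
def route_case(payload: dict) -> str:
    """Map risk level to a downstream queue (names are illustrative)."""
    routes = {
        "high": "compliance_review_queue",
        "medium": "ops_review_queue",
        "low": "no_action",
    }
    return routes[payload["risk_level"]]

def execute_block(payload: dict, human_approved: bool) -> str:
    # The agent never blocks money movement on its own:
    # any block requires an explicit human sign-off.
    if not human_approved:
        raise PermissionError("Blocking requires human approval")
    return f"blocked:{payload['client_id']}"

print(route_case({"risk_level": "high"}))  # compliance_review_queue
```

Keeping the approval check in code, rather than in the prompt, means a model failure cannot bypass it.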
4) Add traceability for audit
Wealth management teams will ask why a case was flagged. Log inputs, retrieved context identifiers, model version, and final decision.
```python
from datetime import datetime, timezone

def audit_log(case_payload: dict) -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "client_id": case_payload["client_id"],
        "risk_level": case_payload["risk_level"],
        "rule_score": case_payload["rule_score"],
        "recommended_action": case_payload["recommended_action"],
        # store references to source docs or retrieval ids in real systems
        "model": "gpt-4o-mini",
    }
    print(json.dumps(record))

audit_log(case_payload)
```
In production, send this to an immutable log store or SIEM. If you cannot reconstruct why the agent made a recommendation six months later, you do not have a production-grade control.
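If a true append-only store is not yet available, hash-chaining the records makes after-the-fact tampering detectable on replay. A minimal sketch, not a substitute for a real WORM store or SIEM:

```python
import hashlib
import json

class AuditChain:
    """Each record embeds the hash of the previous record, so editing
    any historical entry breaks the chain when it is re-verified."""

    def __init__(self):
        self.records = []
        self._prev_hash = "0" * 64  # genesis value

    def append(self, record: dict) -> dict:
        entry = dict(record, prev_hash=self._prev_hash)
        digest = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        entry["hash"] = digest
        self._prev_hash = digest
        self.records.append(entry)
        return entry

chain = AuditChain()
chain.append({"client_id": "C10291", "risk_level": "high"})
chain.append({"client_id": "C10292", "risk_level": "low"})
```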
Production Considerations
- Deploy close to approved data regions
  - Wealth data often has residency constraints tied to jurisdiction and client domicile.
  - Keep embedding stores and logs inside approved cloud regions.
- Monitor false positives by segment
  - High-net-worth clients often have large legitimate transfers.
  - Track precision/recall by advisor desk, client tier, geography, and transaction type so you do not drown operations in noise.
- Add guardrails around sensitive actions
  - The agent should recommend review or escalation.
  - It should not directly execute freezes or blocks unless wrapped in explicit policy controls and approvals.
- Version everything
  - Store prompt templates, policy doc versions, embedding model versions, and rule thresholds.
  - Compliance teams need reproducibility when they ask why a case was flagged last quarter but not today.
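For the versioning point, stamping a run manifest onto every case payload is often enough to answer "why was this flagged last quarter?" All identifiers below are illustrative values, not real version strings:

```python
# Illustrative run manifest; real values come from your deployment pipeline.
RUN_MANIFEST = {
    "prompt_template": "fraud_analysis_v3",
    "llm": "gpt-4o-mini",
    "embedding_model": "text-embedding-3-small",
    "policy_docs_snapshot": "2024-06-30",
    "rule_thresholds": {"large_amount": 250000, "offshore": 25, "wire": 15},
}

def stamp_case(payload: dict) -> dict:
    """Attach the manifest without mutating the original payload."""
    return {**payload, "manifest": RUN_MANIFEST}

stamped = stamp_case({"client_id": "C10291", "risk_level": "high"})
```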
Common Pitfalls
- Using only an LLM without deterministic checks
  - Fraud detection in wealth management needs hard rules for thresholds and jurisdictional constraints.
  - Fix it by combining `VectorStoreIndex` retrieval with explicit scoring logic.
- Indexing raw PII without controls
  - Client statements can contain account numbers, tax IDs, beneficiaries, and health-related trust details.
  - Redact sensitive fields before indexing and restrict access to source documents.
- Treating explanations as proof
  - A fluent explanation from the model is not evidence of fraud.
  - Always attach retrieved policy references or event metadata so compliance reviewers can verify the reasoning independently.
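For the PII pitfall, even a simple pattern-based pass before indexing helps. The regexes below are illustrative and would need tuning per jurisdiction and document type; a dedicated PII-detection service is the safer choice in production:

```python
import re

TIN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US SSN/TIN style
ACCOUNT_RE = re.compile(r"\b\d{8,12}\b")       # bare account-number-like runs

def redact(text: str) -> str:
    """Replace sensitive identifiers before text reaches the index."""
    text = TIN_RE.sub("[REDACTED-TIN]", text)
    text = ACCOUNT_RE.sub("[REDACTED-ACCT]", text)
    return text

print(redact("SSN 123-45-6789, account 0012345678"))
# SSN [REDACTED-TIN], account [REDACTED-ACCT]
```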
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.