How to Build a Fraud Detection Agent Using LangChain in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21
fraud-detection · langchain · python · investment-banking

A fraud detection agent in investment banking is not a chatbot that “flags suspicious activity” in the abstract. It is a workflow engine that ingests trade, payment, KYC, and behavioral signals, correlates them against policy and historical patterns, and returns an auditable decision with evidence attached.

That matters because in investment banking, false negatives are expensive and false positives are operationally disruptive. The agent has to help surveillance teams triage alerts faster without breaking compliance, data residency, or model governance requirements.

Architecture

  • Data ingestion layer

    • Pulls structured inputs from trade logs, payment events, CRM/KYC records, and case management systems.
    • Normalizes timestamps, account identifiers, counterparties, and jurisdiction metadata.
  • Risk rules engine

    • Encodes deterministic checks like threshold breaches, velocity anomalies, sanctioned entity matches, and unusual venue/counterparty combinations.
    • Produces machine-readable findings before any LLM reasoning happens.
  • LangChain decision agent

    • Uses ChatOpenAI plus tools to interpret the risk context and recommend next actions.
    • Produces structured outputs instead of free-form prose.
  • Evidence retrieval layer

    • Uses FAISS or another vector store for policy docs, AML procedures, trade surveillance playbooks, and prior cases.
    • Lets the agent cite internal policy when explaining why an alert matters.
  • Audit and case logging

    • Persists every input, tool call, model output, and final disposition.
    • Supports regulatory review and internal model governance.
  • Human escalation path

    • Routes high-risk or ambiguous cases to a compliance analyst.
    • Prevents the agent from making final decisions on regulated actions without review.
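
To make the flow concrete, here is a minimal orchestration sketch of how the six layers chain together. Every helper name (ingest_event, run_rules, and so on) is a placeholder for whatever your stack provides, not a prescribed API:

def process_alert(raw_event: dict) -> dict:
    # 1. Data ingestion layer: normalize identifiers, timestamps, jurisdictions
    event = ingest_event(raw_event)  # hypothetical helper
    # 2. Risk rules engine: deterministic findings before any LLM reasoning
    findings = run_rules(event)  # hypothetical helper
    # 3 + 4. Decision agent plus evidence retrieval, built in the steps below
    assessment = run_agent(event, findings)  # hypothetical helper
    # 5. Audit and case logging: persist inputs, tool calls, outputs, disposition
    write_audit_record(raw_event, findings, assessment)  # hypothetical helper
    # 6. Human escalation path: never auto-finalize regulated actions
    if assessment["recommended_action"] != "monitor":
        route_to_compliance_analyst(assessment)  # hypothetical helper
    return assessment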

Implementation

1) Install dependencies and define the risk schema

You want the agent to emit structured output. In practice that means a Pydantic model for the fraud assessment and a small set of deterministic rules before the LLM gets involved.

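Assuming a standard pip environment, the imports used throughout this guide map to roughly these packages; pin exact versions through your normal dependency governance:

pip install langchain langchain-openai langchain-community faiss-cpu pydantic
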
from typing import List, Literal
from pydantic import BaseModel, Field

class FraudAssessment(BaseModel):
    risk_level: Literal["low", "medium", "high"] = Field(..., description="Overall fraud risk rating")
    summary: str = Field(..., description="Short rationale for the rating")
    red_flags: List[str] = Field(default_factory=list, description="Specific signals that triggered concern")
    recommended_action: Literal["monitor", "escalate_to_compliance", "block_and_review"] = Field(..., description="Next step for the surveillance team")
    evidence: List[str] = Field(default_factory=list, description="Policy citations and data points supporting the rating")
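
As a quick sanity check, you can validate a sample payload against the schema directly (this assumes Pydantic v2, which current LangChain releases use):

sample = FraudAssessment.model_validate({
    "risk_level": "high",
    "summary": "Sanctions hit on counterparty; large notional equity swap.",
    "red_flags": ["Sanctions screening hit", "Large trade value"],
    "recommended_action": "block_and_review",
    "evidence": ["Internal sanctions screening policy"],
})
print(sample.model_dump_json(indent=2))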

2) Build retrieval over internal policy documents

For investment banking, your agent needs to ground its explanation in internal controls. A basic FAISS retriever is enough to start if you already have approved documents split into chunks.

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

docs = [
    Document(page_content="Escalate any trade with unusual counterparties in restricted jurisdictions."),
    Document(page_content="Payments above threshold require enhanced due diligence review."),
    Document(page_content="Any match against sanctions screening must be reviewed by compliance before release."),
]

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
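
A one-line sanity check confirms retrieval works before wiring it into the agent:

for doc in retriever.invoke("payment above threshold"):
    print(doc.page_content)  # should surface the enhanced due diligence snippet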

3) Create the LangChain agent with tools

Use a tool for policy retrieval and another for deterministic scoring. Keep the model on a short leash: it should explain and prioritize findings, not invent facts.

import json
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain.agents import create_tool_calling_agent, AgentExecutor

@tool
def fetch_policy(query: str) -> str:
    """Retrieve relevant internal policy snippets."""
    # get_relevant_documents is deprecated in recent LangChain; use invoke
    results = retriever.invoke(query)
    return "\n".join(doc.page_content for doc in results)

@tool
def rule_based_score(trade_amount: float, jurisdiction: str, sanctions_hit: bool) -> str:
    """Return deterministic fraud signals."""
    flags = []
    score = 0

    if trade_amount > 5_000_000:
        score += 30
        flags.append("Large trade value")
    if jurisdiction.lower() in {"iran", "north korea", "russia"}:
        score += 50
        flags.append("Restricted jurisdiction")
    if sanctions_hit:
        score += 80
        flags.append("Sanctions screening hit")

    risk_level = "high" if score >= 80 else "medium" if score >= 30 else "low"
    return json.dumps({"score": score, "risk_level": risk_level, "flags": flags})

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a fraud detection analyst for investment banking. "
     "Use only provided tools and evidence. "
     "Return concise output grounded in policy."),
    ("human",
     "Investigate this event:\n"
     "{event}\n\n"
     "Return a fraud assessment with risk level, red flags, evidence, and action."),
    # create_tool_calling_agent requires a scratchpad slot for intermediate tool calls
    ("placeholder", "{agent_scratchpad}"),
])

tools = [fetch_policy, rule_based_score]
agent = create_tool_calling_agent(llm=llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=False)

4) Run an assessment and parse the result

The event payload should be explicit. Include fields your downstream case management system can persist for audit purposes.

event = {
    "trade_amount": 12000000,
    "jurisdiction": "UAE",
    "sanctions_hit": False,
    "counterparty": "XYZ Capital",
    "product": "Equity swap",
    "notes": "Repeated booking changes within same day"
}

result = executor.invoke({"event": json.dumps(event)})
print(result["output"])

If you want stricter structure in production, wrap the final response with with_structured_output(FraudAssessment) on ChatOpenAI for the last classification step. That gives you typed output you can validate before writing to your case system.
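
A minimal sketch of that final step, reusing the agent output from above (the prompt wording here is illustrative):

classifier = llm.with_structured_output(FraudAssessment)
assessment = classifier.invoke(
    "Convert this investigation summary into a fraud assessment:\n" + result["output"]
)
# assessment is now a validated FraudAssessment instance, safe to persist
print(assessment.risk_level, assessment.recommended_action)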

Production Considerations

  • Compliance first

    • Keep the agent advisory-only unless legal has approved automated blocking.
    • Log every prompt, retrieved policy chunk, tool result, and final recommendation for audit trails.
  • Data residency

    • Host embeddings stores and model endpoints in approved regions only.
    • Do not send client PII or transaction details to unmanaged SaaS endpoints without contractual approval.
  • Monitoring

    • Track false positive rate by desk, product type, jurisdiction, and counterparty segment.
    • Monitor tool failure rates separately from model quality so you can isolate retrieval issues from reasoning issues.
  • Guardrails

    • Restrict tools to approved data sources; do not let the model browse arbitrary systems.
    • Add deterministic thresholds so sanctions hits and hard policy violations always escalate regardless of LLM output, as in the sketch after this list.
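
A minimal sketch of such a deterministic override, assuming the JSON returned by rule_based_score in step 3 has been parsed with json.loads:

def apply_hard_stops(assessment: FraudAssessment, rule_result: dict) -> FraudAssessment:
    # Hard policy violations escalate no matter what the LLM concluded
    if "Sanctions screening hit" in rule_result["flags"]:
        assessment.risk_level = "high"
        assessment.recommended_action = "block_and_review"
        assessment.red_flags.append("Sanctions screening hit (deterministic override)")
    return assessment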

Common Pitfalls

  • Letting the LLM make final compliance decisions

    • Avoid this by using rules for hard stops and reserving the model for explanation and prioritization.
  • Feeding raw enterprise data without normalization

    • Trade IDs, booking dates, legal entity names, and jurisdiction codes need standardization before analysis; see the sketch after this list.
  • Skipping auditability

    • If you cannot reconstruct why an alert was escalated six months later, your implementation is not production-ready.
  • Mixing public models with restricted data

    • Route sensitive banking data through approved infrastructure only. If your deployment crosses regions or vendors accidentally, that becomes a governance issue fast.
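
Even a small standardization pass helps. Here is a hedged sketch of the idea; the field names and code mapping are illustrative, not a standard:

from datetime import datetime, timezone

# Illustrative lookup; in practice, source codes from a reference-data service
JURISDICTION_CODES = {"united arab emirates": "AE", "uae": "AE"}

def normalize_event(raw: dict) -> dict:
    return {
        "trade_id": raw["trade_id"].strip().upper(),
        "booked_at": datetime.fromisoformat(raw["booking_date"]).astimezone(timezone.utc).isoformat(),
        "legal_entity": " ".join(raw["legal_entity"].split()).upper(),
        "jurisdiction": JURISDICTION_CODES.get(raw["jurisdiction"].strip().lower(), raw["jurisdiction"].strip().upper()),
    }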

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
