How to Build a Fraud Detection Agent Using LangGraph in Python for Retail Banking
A fraud detection agent for retail banking watches transaction events, scores them against policy and behavioral signals, and decides whether to approve, step-up authenticate, hold for review, or escalate to an analyst. It matters because fraud losses are only half the problem; false positives also kill customer trust, create support load, and can trigger compliance issues if your decisions are inconsistent or not auditable.
Architecture
- Event ingestion layer: pulls card payments, ACH transfers, login events, beneficiary changes, and device fingerprints from your stream or API.
- Feature enrichment layer: adds account age, transaction velocity, geolocation distance, merchant category, device reputation, and historical risk signals.
- Policy and decision layer: applies bank rules such as amount thresholds, sanctions checks, velocity limits, and country restrictions before any model call.
- LLM reasoning layer: explains why a case is suspicious in plain language for investigators and generates structured case notes.
- Human escalation layer: routes high-risk or ambiguous cases to a fraud analyst queue with evidence attached.
- Audit and persistence layer: stores inputs, outputs, decision path, timestamps, and model version for compliance review and dispute handling.
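Before wiring the graph, it helps to pin down the event shape these layers pass along. Here is a minimal sketch, with field names chosen to mirror the code in the Implementation section; they are illustrative, not a standard schema:

```python
# Sketch of the transaction event the downstream nodes consume.
# Field names are assumptions that mirror the later snippets.
from dataclasses import dataclass, asdict


@dataclass
class TransactionEvent:
    amount: float
    country: str              # where the transaction originated
    account_country: str      # where the account is domiciled
    velocity_1h: int = 0      # transactions on this account in the last hour
    new_device: bool = False  # device fingerprint not seen before

    def as_payload(self) -> dict:
        """Plain-dict form, matching what the graph state expects."""
        return asdict(self)
```

Keeping the contract explicit like this lets the ingestion layer validate events once, so every downstream node can trust the payload.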
Implementation
- Define the state that moves through the graph
For retail banking you want a typed state object that carries the transaction payload, enriched features, risk score, decision, and audit trail. Keep it explicit so every node can be tested in isolation.
```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class FraudState(TypedDict):
    transaction: dict
    features: dict
    risk_score: float
    decision: str
    rationale: str
    audit_log: list[str]


def enrich_features(state: FraudState) -> FraudState:
    tx = state["transaction"]
    features = {
        "amount": tx["amount"],
        "country_mismatch": tx["country"] != tx["account_country"],
        "velocity_1h": tx.get("velocity_1h", 0),
        "new_device": tx.get("new_device", False),
    }
    return {**state, "features": features}


def score_risk(state: FraudState) -> FraudState:
    f = state["features"]
    score = 0.0
    score += 0.4 if f["country_mismatch"] else 0.0
    score += 0.3 if f["new_device"] else 0.0
    score += min(f["velocity_1h"] / 10.0, 0.3)
    return {**state, "risk_score": min(score, 1.0)}


def decide(state: FraudState) -> FraudState:
    score = state["risk_score"]
    if score >= 0.8:
        decision = "hold_for_review"
    elif score >= 0.5:
        decision = "step_up_auth"
    else:
        decision = "approve"
    return {
        **state,
        "decision": decision,
        "audit_log": state.get("audit_log", []) + [f"decision={decision}, score={score}"],
    }


graph = StateGraph(FraudState)
graph.add_node("enrich_features", enrich_features)
graph.add_node("score_risk", score_risk)
graph.add_node("decide", decide)
graph.add_edge(START, "enrich_features")
graph.add_edge("enrich_features", "score_risk")
graph.add_edge("score_risk", "decide")
graph.add_edge("decide", END)
fraud_agent = graph.compile()
```
- Add a conditional route for analyst escalation
LangGraph's `add_conditional_edges` is the clean way to branch on risk without hardcoding nested if chains inside nodes. In banking systems this keeps your approval path separate from your investigation path.
```python
from typing import Literal


def route_case(state: FraudState) -> Literal["approve", "step_up_auth", "hold_for_review"]:
    return state["decision"]


def analyst_review(state: FraudState) -> FraudState:
    note = (
        f"Review required for amount={state['transaction']['amount']} "
        f"risk={state['risk_score']}"
    )
    return {**state, "rationale": note}


workflow = StateGraph(FraudState)
workflow.add_node("enrich_features", enrich_features)
workflow.add_node("score_risk", score_risk)
workflow.add_node("decide", decide)
workflow.add_node("analyst_review", analyst_review)
workflow.add_edge(START, "enrich_features")
workflow.add_edge("enrich_features", "score_risk")
workflow.add_edge("score_risk", "decide")
workflow.add_conditional_edges(
    "decide",
    route_case,
    {
        "approve": END,
        "step_up_auth": END,
        "hold_for_review": "analyst_review",
    },
)
workflow.add_edge("analyst_review", END)
fraud_agent = workflow.compile()
```
- Run the graph with a real transaction payload
Use `invoke()` for synchronous scoring in request/response flows such as card authorization or online transfer checks. For streaming event pipelines you can switch to `stream()` later without changing node logic.
```python
sample_tx = {
    "amount": 4200,
    "country": "NG",
    "account_country": "GB",
    "velocity_1h": 18,
    "new_device": True,
}

result = fraud_agent.invoke(
    {
        "transaction": sample_tx,
        "features": {},
        "risk_score": 0.0,
        "decision": "",
        "rationale": "",
        "audit_log": [],
    }
)

print(result["decision"])   # hold_for_review: country mismatch + new device + high velocity
print(result["audit_log"])
```
- Add an LLM explanation node only after policy gates
Do not let an LLM make the final fraud decision by itself. Use it to summarize evidence after deterministic rules have already classified the case into a narrow band.
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def explain_case(state: FraudState) -> FraudState:
    prompt = (
        "Write a concise investigator note for this retail banking case:\n"
        f"Transaction: {state['transaction']}\n"
        f"Features: {state['features']}\n"
        f"Risk score: {state['risk_score']}\n"
        f"Decision: {state['decision']}"
    )
    response = llm.invoke(prompt)
    return {**state, "rationale": response.content}


# Insert explain_case between decide and analyst_review if needed.
```
Production Considerations
- Compliance and auditability
  - Persist the full LangGraph state transition log with model versioning, rule versioning, and timestamps.
  - For adverse actions or manual holds, keep enough evidence to explain why the customer was blocked or challenged.
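As a sketch of what persisting that evidence can look like: one JSON row per decision, carrying version tags alongside the final graph state. The version constants here are hypothetical placeholders, not LangGraph features:

```python
# Audit-record sketch: serialize the final graph state plus version metadata
# so a dispute or regulator review can reconstruct the decision.
import json
import time

RULE_VERSION = "risk-rules-2025-01"   # hypothetical version tags; use your
GRAPH_VERSION = "fraud-graph-v3"      # own release identifiers in practice


def build_audit_record(state: dict) -> str:
    record = {
        "ts": time.time(),
        "rule_version": RULE_VERSION,
        "graph_version": GRAPH_VERSION,
        "transaction": state["transaction"],
        "features": state["features"],
        "risk_score": state["risk_score"],
        "decision": state["decision"],
        "audit_log": state["audit_log"],
    }
    return json.dumps(record, sort_keys=True)
```

Write this record to append-only storage at the end of every run, not just on holds, so approvals are equally defensible.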
- Data residency
  - Keep transaction data in-region if your banking license requires it.
  - If you call external models, strip PII where possible and send only the minimum feature set needed for reasoning.
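A minimal sketch of the "minimum feature set" idea: an explicit allow-list applied before any external model call. The allow-list contents are an assumption; derive yours from a policy review:

```python
# Redaction sketch: only allow-listed, non-identifying signals leave the
# region. ALLOWED_FEATURES is an assumption, not a standard.
ALLOWED_FEATURES = {"amount", "country_mismatch", "velocity_1h", "new_device"}


def strip_for_external_call(features: dict) -> dict:
    """Drop anything not on the allow-list (PANs, names, emails, raw device IDs)."""
    return {k: v for k, v in features.items() if k in ALLOWED_FEATURES}
```

An allow-list is safer than a deny-list here: a new upstream field is excluded by default instead of leaking until someone notices.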
- Monitoring
  - Track false positive rate, false negative rate, analyst override rate, step-up auth conversion rate, and average time-to-decision.
  - Alert on drift in velocity patterns or geographic anomalies, because fraud patterns change faster than quarterly model retraining cycles.
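To make two of those metrics concrete, here is a sketch that computes false positive and false negative rates from labeled outcomes. It assumes you can pair each agent decision with the eventual ground-truth label from analyst disposition:

```python
# Metric sketch: each case pairs the agent's decision with the eventual
# ground-truth label ("was this actually fraud?").
def decision_metrics(cases: list[tuple[str, bool]]) -> dict:
    flagged = [fraud for decision, fraud in cases if decision != "approve"]
    approved = [fraud for decision, fraud in cases if decision == "approve"]
    false_positives = sum(1 for fraud in flagged if not fraud)
    false_negatives = sum(1 for fraud in approved if fraud)
    return {
        "false_positive_rate": false_positives / len(flagged) if flagged else 0.0,
        "false_negative_rate": false_negatives / len(approved) if approved else 0.0,
    }
```

Run this over a rolling window and alert on movement, not absolute values, since base fraud rates vary by portfolio.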
- Guardrails
  - Hard-code policy checks for sanctions lists, prohibited corridors, and high-risk MCCs at the rule layer before any probabilistic step.
  - Never let the LLM override deterministic controls; it should explain or classify within boundaries you already defined.
Common Pitfalls
- Putting all logic inside one LLM prompt
  - This makes decisions non-deterministic and impossible to audit.
  - Split policy checks, scoring logic, and explanation into separate LangGraph nodes.
- Ignoring branch-specific test coverage
  - Teams test the happy path and miss escalation paths like `hold_for_review`.
  - Write unit tests against each node plus integration tests for each conditional edge.
- Skipping feature provenance
  - If you cannot trace where a risk signal came from during a dispute or regulator review, you have a problem.
  - Store source system IDs for every feature used in scoring so analysts can reconstruct the decision.
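Tying the pitfalls back to the earlier graph, here is a sketch of branch-level unit tests for the decide node. The function is inlined so the snippet stands alone; in a real repo you would import it from the module that defines the graph:

```python
# Branch-coverage sketch for the decide node from the implementation section.
def decide(state: dict) -> dict:
    score = state["risk_score"]
    if score >= 0.8:
        decision = "hold_for_review"
    elif score >= 0.5:
        decision = "step_up_auth"
    else:
        decision = "approve"
    return {**state, "decision": decision}


def test_each_risk_band():
    # One assertion per branch, including the boundaries, so the
    # hold_for_review escalation path cannot silently regress.
    assert decide({"risk_score": 0.1})["decision"] == "approve"
    assert decide({"risk_score": 0.5})["decision"] == "step_up_auth"
    assert decide({"risk_score": 0.8})["decision"] == "hold_for_review"
    assert decide({"risk_score": 1.0})["decision"] == "hold_for_review"
```

Pair these with integration tests that invoke the compiled graph once per conditional edge, asserting which terminal node each payload reaches.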
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.