How to Build a Fraud Detection Agent Using LangGraph in Python for Lending
A fraud detection agent for lending screens an application, scores risk signals, and decides whether to approve, reject, or route the case to manual review. In lending, that matters because fraud is not just a loss problem; it affects compliance, underwriting accuracy, charge-offs, and your ability to explain decisions to auditors and regulators.
Architecture
- **Input normalizer**
  - Takes raw application data from LOS/CRM/KYC systems.
  - Standardizes fields like name, address, employer, income, device ID, and IP metadata.
- **Risk signal extractor**
  - Pulls structured checks from rules or external services.
  - Examples: identity mismatch, velocity spikes, synthetic identity indicators, bureau inconsistencies.
- **LLM reasoning node**
  - Summarizes evidence into a short fraud rationale.
  - Should never be the only decision-maker; use it to explain and triage.
- **Decision router**
  - Converts signals into one of three actions: `approve`, `reject`, or `manual_review`.
- **Audit logger**
  - Persists every input signal, intermediate decision, and final outcome.
  - Needed for model governance and lender audit trails.
- **Policy guardrail layer**
  - Enforces lending-specific constraints.
  - Example: if residency rules require EU-only processing, block non-compliant routes before any external API call.
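The guardrail layer is not implemented in the walkthrough below, so here is a minimal sketch of the residency example. The region allow-list and the `ResidencyViolation` error are illustrative assumptions, not part of any LangGraph API:

```python
# Minimal residency guardrail sketch. The allow-list and error type are
# illustrative; a real deployment would load policy from configuration.

EU_REGIONS = {"DE", "FR", "IE", "NL"}  # hypothetical approved processing regions

class ResidencyViolation(Exception):
    """Raised when an applicant's data must not leave an approved region."""

def enforce_residency(applicant_country: str, processing_region: str) -> None:
    # If the applicant is in the EU, processing must also stay in the EU.
    if applicant_country in EU_REGIONS and processing_region not in EU_REGIONS:
        raise ResidencyViolation(
            f"EU applicant cannot be processed in {processing_region}"
        )

# Runs before any external API call in the graph:
enforce_residency("DE", "FR")  # both in-region, no exception
```

A check like this belongs at the start of the graph, so a restricted application fails fast instead of reaching an external model endpoint.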
Implementation
1) Define the state and decision schema
Use a typed state so each node in the graph has a clear contract. For lending workflows, keep both the raw applicant data and derived fraud signals in state so you can audit the full path later.
```python
from typing import Literal, TypedDict

from langgraph.graph import StateGraph, START, END

Decision = Literal["approve", "reject", "manual_review"]

class FraudState(TypedDict):
    applicant_id: str
    applicant_data: dict
    risk_signals: dict
    fraud_rationale: str
    decision: Decision
    audit_log: list[str]
```
2) Build deterministic checks first
Do not start with the LLM. In lending fraud detection, deterministic rules catch most obvious cases and are easier to defend in an audit.
```python
def normalize_applicant(state: FraudState) -> FraudState:
    data = state["applicant_data"]
    normalized = {
        "full_name": data["full_name"].strip().lower(),
        "email_domain": data["email"].split("@")[-1].lower(),
        "income": float(data["income"]),
        "country": data["country"].upper(),
        "ip_country": data.get("ip_country", "").upper(),
    }
    return {
        **state,
        "applicant_data": normalized,
        "audit_log": state.get("audit_log", []) + ["normalized applicant fields"],
    }

def extract_risk_signals(state: FraudState) -> FraudState:
    data = state["applicant_data"]
    signals = {
        "country_mismatch": data["country"] != data["ip_country"],
        "free_email": data["email_domain"] in {"gmail.com", "yahoo.com", "outlook.com"},
        "high_income_low_signal": data["income"] > 200000,
    }
    return {
        **state,
        "risk_signals": signals,
        "audit_log": state.get("audit_log", []) + [f"signals={signals}"],
    }
```
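The architecture section also lists velocity spikes as a deterministic check. A self-contained sketch, assuming applications arrive as timestamped events and that three applications inside 24 hours is an illustrative threshold (tune per product):

```python
from datetime import datetime, timedelta

def velocity_spike(
    timestamps: list[datetime],
    window: timedelta = timedelta(hours=24),
    threshold: int = 3,  # illustrative cutoff, not a recommendation
) -> bool:
    """True if `threshold` or more applications fall inside any `window`."""
    ts = sorted(timestamps)
    for i in range(len(ts)):
        # Count events within `window` of the i-th event.
        in_window = sum(1 for t in ts[i:] if t - ts[i] <= window)
        if in_window >= threshold:
            return True
    return False

now = datetime(2024, 5, 1, 12, 0)
burst = [now, now + timedelta(hours=1), now + timedelta(hours=2)]   # spike
spread = [now, now + timedelta(days=2), now + timedelta(days=5)]    # normal
```

The O(n²) scan is fine for a per-applicant history; at portfolio scale you would precompute counts in your feature store instead.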
3) Add an LLM-based explanation node and route by policy
Use the LLM to summarize evidence and produce a concise rationale. Then route using explicit policy logic so the graph remains deterministic where it matters.
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def explain_fraud_risk(state: FraudState) -> FraudState:
    prompt = f"""
You are assisting with lending fraud triage.

Applicant ID: {state['applicant_id']}
Risk signals: {state['risk_signals']}

Return a short rationale focused on fraud indicators only.
"""
    response = llm.invoke(prompt)
    return {
        **state,
        "fraud_rationale": response.content.strip(),
        "audit_log": state.get("audit_log", []) + ["generated rationale"],
    }

def decide(state: FraudState) -> FraudState:
    s = state["risk_signals"]
    # Any soft signal routes to manual review; reject is reserved for
    # hard-fail rules (e.g. confirmed identity theft) not shown here.
    if s["country_mismatch"] or s["high_income_low_signal"]:
        decision = "manual_review"
    else:
        decision = "approve"
    return {
        **state,
        "decision": decision,
        "audit_log": state.get("audit_log", []) + [f"decision={decision}"],
    }
```
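As the rule set grows, the routing policy can be expressed as an ordered rule table instead of branching code, which is easier to review, version, and log. A sketch, where the rule names are illustrative and the first matching rule wins:

```python
# Ordered (name, predicate, decision) rules; first match wins.
# Rule names are illustrative and double as audit-log entries.
RULES = [
    ("geo_mismatch", lambda s: s["country_mismatch"], "manual_review"),
    ("income_outlier", lambda s: s["high_income_low_signal"], "manual_review"),
]

def decide_from_table(signals: dict) -> tuple[str, str]:
    """Return (rule_name, decision) so the audit trail records which rule fired."""
    for name, predicate, decision in RULES:
        if predicate(signals):
            return name, decision
    return "default", "approve"
```

Because the table is plain data, risk and compliance teams can diff it between releases without reading graph code.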
4) Compile the LangGraph workflow
This is the actual LangGraph pattern: define nodes with `add_node`, connect them with `add_edge`, set conditional routing when needed, then compile.
```python
def build_graph():
    graph = StateGraph(FraudState)

    graph.add_node("normalize_applicant", normalize_applicant)
    graph.add_node("extract_risk_signals", extract_risk_signals)
    graph.add_node("explain_fraud_risk", explain_fraud_risk)
    graph.add_node("decide", decide)

    graph.add_edge(START, "normalize_applicant")
    graph.add_edge("normalize_applicant", "extract_risk_signals")

    def route_after_signals(state: FraudState):
        # In production you can short-circuit hard fails here.
        return "explain_fraud_risk"

    graph.add_conditional_edges(
        "extract_risk_signals",
        route_after_signals,
        {"explain_fraud_risk": "explain_fraud_risk"},
    )

    graph.add_edge("explain_fraud_risk", "decide")
    graph.add_edge("decide", END)

    return graph.compile()

app = build_graph()

result = app.invoke({
    "applicant_id": "LN-100045",
    "applicant_data": {
        "full_name": "Jane Doe",
        "email": "[email protected]",
        "income": 180000,
        "country": "GB",
        "ip_country": "GB",
    },
    "risk_signals": {},
    "fraud_rationale": "",
    "decision": "",
    "audit_log": [],
})

print(result["decision"])
print(result["fraud_rationale"])
```
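The `route_after_signals` stub notes that hard fails can be short-circuited. A sketch of that router as a pure function, assuming a `hard_fail` signal that the extractor above does not produce (e.g. a confirmed identity-theft hit from an external service):

```python
def route_after_signals(state: dict) -> str:
    """Return the next node name; skip the LLM on unambiguous fraud."""
    s = state["risk_signals"]
    # `hard_fail` is a hypothetical signal set by an upstream check.
    if s.get("hard_fail"):
        return "decide"  # no rationale needed; go straight to the decision
    return "explain_fraud_risk"
```

With this router, the mapping passed to `add_conditional_edges` needs both targets: `{"explain_fraud_risk": "explain_fraud_risk", "decide": "decide"}`. Skipping the LLM on hard fails saves latency and cost and keeps the clearest-cut cases fully deterministic.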
Production Considerations

- **Keep hard decisions deterministic**
  - Use rules for obvious fraud patterns and reserve the LLM for explanation or triage.
  - This is easier to validate under model risk management.
- **Log everything needed for audit**
  - Persist input payload hashes, extracted signals, node outputs, final decision, model version, and prompt version.
  - Lending teams will need this for adverse action reviews and internal audits.
- **Respect data residency**
  - If applications include PII or credit-related data across regions, keep inference in-region.
  - Do not send sensitive lending data to third-party APIs without legal approval and DPA coverage.
- **Monitor drift by segment**
  - Track false positives by product type: personal loans vs SME lending behave differently.
  - Watch for proxy bias in features like ZIP code, device geography, or email domain.
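The payload-hash idea from the logging point can be sketched with the standard library. Field names and the version tag are illustrative:

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(applicant_id: str, payload: dict, decision: str,
                 prompt_version: str = "v1") -> dict:
    """Build one durable audit entry; field names are illustrative."""
    # Hash a canonical JSON form so identical payloads always hash the same
    # regardless of dict key order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return {
        "applicant_id": applicant_id,
        "payload_sha256": hashlib.sha256(canonical.encode()).hexdigest(),
        "decision": decision,
        "prompt_version": prompt_version,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
```

Storing the hash rather than the raw payload lets you prove later that an input was unchanged without retaining sensitive fields past their retention window.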
Common Pitfalls

- **Letting the LLM make the final call**
  - Bad pattern: "model says fraud."
  - Fix: use explicit rule-based routing for approve/reject/manual review and let the model justify or summarize.
- **Skipping auditability**
  - If you cannot reconstruct why an application was routed to review, you will have problems with compliance.
  - Fix: store node-level outputs in a durable log with timestamps and versioned prompts.
- **Using weak normalization**
  - Small inconsistencies in names, country codes, or income formats create noisy downstream results.
  - Fix: normalize before scoring and validate inputs at the edge of the system.
- **Ignoring jurisdiction-specific constraints**
  - A lender operating across regions may have different retention rules or prohibited attributes.
  - Fix: add policy checks early in the graph so restricted cases never reach external calls or unsupported models.
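The weak-normalization fix, validating at the edge, can be sketched with the standard library. This assumes ISO-3166 alpha-2 country codes and that income may arrive as a string with thousands separators; the field set is illustrative:

```python
def validate_applicant(raw: dict) -> dict:
    """Normalize and reject malformed input before it reaches scoring."""
    errors = []

    # Country: require a two-letter alpha code (ISO-3166 alpha-2).
    country = str(raw.get("country", "")).strip().upper()
    if len(country) != 2 or not country.isalpha():
        errors.append("country must be an ISO-3166 alpha-2 code")

    # Income: accept "180,000" or 180000, reject non-numeric values.
    income = None
    try:
        income = float(str(raw.get("income", "")).replace(",", ""))
        if income < 0:
            errors.append("income must be non-negative")
    except ValueError:
        errors.append("income is not numeric")

    if errors:
        raise ValueError("; ".join(errors))
    return {"country": country, "income": income}
```

Rejecting bad input here, with an explicit error message, is far cheaper than debugging a noisy `country_mismatch` signal three nodes downstream.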
Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.