How to Build a Transaction Monitoring Agent Using LangGraph in Python for Pension Funds
A transaction monitoring agent for pension funds watches member and fund movements, scores them for suspicious or non-compliant behavior, and routes the right cases to compliance teams. It matters because pension money has a long liability horizon, strict governance, and heavy audit requirements; missed anomalies can become regulatory issues, while too many false positives waste analyst time.
Architecture
- Ingestion layer
  - Pulls transactions from core pension admin systems, custodians, payment rails, and batch files.
  - Normalizes fields like member ID, employer ID, contribution type, amount, currency, timestamp, and jurisdiction.
- Rules and feature extraction node
  - Computes deterministic checks before any model call.
  - Examples: duplicate contributions, unusual withdrawal timing, contribution caps exceeded, dormant account activity, and jurisdiction mismatch.
- Risk scoring node
  - Produces a structured risk assessment from the transaction context.
  - Combines policy rules with LLM-assisted reasoning for edge cases that need narrative interpretation.
- Case decision node
  - Decides whether to approve, hold for review, or escalate.
  - Keeps the decision output machine-readable so downstream case management can ingest it.
- Audit trail store
  - Persists every input, intermediate state, final decision, and model version.
  - Required for compliance reviews and regulator queries.
- Human review handoff
  - Sends high-risk or ambiguous cases to analysts with a concise explanation.
  - Keeps humans in the loop for member benefit protection and exception handling.
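The ingestion layer's normalization step can be sketched as a small mapping function. The raw key names here (`txnId`, `acct`, and so on) are hypothetical; real custodian and payment-rail feeds each need their own mapping onto the canonical fields.

```python
from datetime import datetime, timezone

def normalize_transaction(raw: dict) -> dict:
    """Map a raw feed record onto the agent's canonical fields.

    The raw key names ("txnId", "acct", ...) are illustrative assumptions;
    a production system would keep one mapping per source system.
    """
    return {
        "transaction_id": str(raw["txnId"]),
        "member_id": str(raw["acct"]),
        "amount": float(raw["amt"]),
        "currency": raw.get("ccy", "GBP").upper(),
        "jurisdiction": raw.get("ctry", "GB").upper(),
        "txn_type": raw.get("kind", "contribution").lower(),
        "timestamp": raw.get("ts") or datetime.now(timezone.utc).isoformat(),
    }

record = normalize_transaction(
    {"txnId": 10001, "acct": "MEM-77821", "amt": "75000",
     "ccy": "usd", "ctry": "us", "kind": "WITHDRAWAL"}
)
```

Normalizing early means every downstream node can assume consistent casing and types instead of re-validating per source.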
Implementation
1) Define the graph state and decision schema
Use a typed state so every node knows what it can read and write. For production monitoring, keep the final output structured; don’t let free-form text become your interface.
```python
from __future__ import annotations

from typing import Literal, TypedDict

from pydantic import BaseModel, Field


class TransactionState(TypedDict):
    transaction_id: str
    member_id: str
    amount: float
    currency: str
    jurisdiction: str
    txn_type: str
    risk_score: int
    flags: list[str]
    decision: str
    rationale: str


class Decision(BaseModel):
    action: Literal["approve", "hold", "escalate"]
    risk_score: int = Field(ge=0, le=100)
    rationale: str
```
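A useful detail of this state design: LangGraph nodes return only the keys they update, and those partial dicts are merged into the shared state. The merge semantics can be illustrated with plain dicts, no LangGraph required:

```python
# Each node returns only the keys it wants to update; for TypedDict state
# without custom reducers, the new value for a key simply replaces the old one.
state = {
    "transaction_id": "TXN-10001",
    "risk_score": 0,
    "flags": [],
    "decision": "",
}

# A rules node might return just these two keys...
node_update = {"risk_score": 75, "flags": ["high_value_transaction"]}

# ...and the framework merges them, leaving untouched keys intact.
state = {**state, **node_update}
```

This is why each node below can stay small: it reads the whole state but writes only its own outputs.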
2) Build deterministic checks first
For pension funds, rules should catch obvious policy breaches before any probabilistic step. This reduces cost and makes audit explanations easier.
```python
def rule_check(state: TransactionState) -> dict:
    flags = []
    if state["amount"] > 50000:
        flags.append("high_value_transaction")
    if state["txn_type"] == "withdrawal" and state["jurisdiction"] not in {"GB", "IE", "ZA"}:
        flags.append("cross_jurisdiction_withdrawal")
    if state["amount"] % 1000 == 0 and state["amount"] >= 10000:
        flags.append("round_amount_pattern")
    risk_score = min(100, len(flags) * 25)
    return {
        "flags": flags,
        "risk_score": risk_score,
        "rationale": f"Rule check found {len(flags)} trigger(s): {', '.join(flags) if flags else 'none'}",
    }
```
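A quick sanity check of these rules on the worked example used later in this post (a 75,000 USD withdrawal). The function is reproduced here so the snippet runs standalone:

```python
def rule_check(state: dict) -> dict:
    # Same deterministic checks as above, minus the rationale string.
    flags = []
    if state["amount"] > 50000:
        flags.append("high_value_transaction")
    if state["txn_type"] == "withdrawal" and state["jurisdiction"] not in {"GB", "IE", "ZA"}:
        flags.append("cross_jurisdiction_withdrawal")
    if state["amount"] % 1000 == 0 and state["amount"] >= 10000:
        flags.append("round_amount_pattern")
    return {"flags": flags, "risk_score": min(100, len(flags) * 25)}

result = rule_check({"amount": 75000.0, "txn_type": "withdrawal", "jurisdiction": "US"})
# All three rules fire: high value, cross-jurisdiction withdrawal, round amount,
# so the score lands at 75 before the LLM sees anything.
```

Because the scoring is a pure function of the input, these checks are trivially unit-testable, which is exactly what you want for an auditable first line of defense.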
3) Add an LLM-backed review node with LangGraph
This pattern uses StateGraph, add_node, add_edge, set_entry_point, and compile. The LLM should only make decisions on top of deterministic signals; keep the prompt narrow and the output structured.
```python
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def llm_review(state: TransactionState) -> dict:
    prompt = f"""
You are reviewing a pension fund transaction for compliance triage.

Transaction:
- id: {state['transaction_id']}
- member_id: {state['member_id']}
- amount: {state['amount']}
- currency: {state['currency']}
- jurisdiction: {state['jurisdiction']}
- type: {state['txn_type']}
- flags: {state.get('flags', [])}
- current_risk_score: {state.get('risk_score', 0)}

Return one of:
approve | hold | escalate

Keep rationale short and specific to pension-fund compliance.
"""
    resp = llm.invoke(prompt).content.strip().lower()
    if "escalate" in resp:
        action = "escalate"
        score = max(state.get("risk_score", 0), 80)
    elif "hold" in resp:
        action = "hold"
        score = max(state.get("risk_score", 0), 60)
    else:
        action = "approve"
        score = min(state.get("risk_score", 0), 30)
    return {
        "decision": action,
        "risk_score": score,
        "rationale": f"LLM review returned '{action}' after rule-based screening.",
    }


def finalize_decision(state: TransactionState) -> dict:
    score = state.get("risk_score", 0)
    if score >= 80:
        decision = "escalate"
    elif score >= 50:
        decision = "hold"
    else:
        decision = "approve"
    return {"decision": decision}


graph = StateGraph(TransactionState)
graph.add_node("rule_check", rule_check)
graph.add_node("llm_review", llm_review)
graph.add_node("finalize_decision", finalize_decision)
graph.set_entry_point("rule_check")
graph.add_edge("rule_check", "llm_review")
graph.add_edge("llm_review", "finalize_decision")
graph.add_edge("finalize_decision", END)
app = graph.compile()
```
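One natural extension, sketched here as an assumption rather than part of the graph above: skip the LLM entirely when the deterministic rules find nothing. LangGraph supports this via `add_conditional_edges`, which takes a router function that returns the name of the next node. The router itself is plain Python and can be tested without the graph:

```python
def route_after_rules(state: dict) -> str:
    """Route clean transactions straight to finalization, bypassing the LLM.

    Wiring this in would replace the unconditional rule_check -> llm_review
    edge, e.g.:

        graph.add_conditional_edges("rule_check", route_after_rules)
    """
    return "finalize_decision" if not state.get("flags") else "llm_review"

clean = route_after_rules({"flags": [], "risk_score": 0})
flagged = route_after_rules({"flags": ["high_value_transaction"], "risk_score": 25})
```

This keeps model cost and latency off the clean-transaction path, which is the bulk of pension contribution volume.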
4) Run the agent and persist the audit record
The compiled graph returns the final merged state. In production, store both the raw input and each node’s outputs with model versioning so auditors can reconstruct why a case was held or escalated.
```python
initial_state: TransactionState = {
    "transaction_id": "TXN-10001",
    "member_id": "MEM-77821",
    "amount": 75000.0,
    "currency": "USD",
    "jurisdiction": "US",
    "txn_type": "withdrawal",
    "risk_score": 0,
    "flags": [],
    "decision": "",
    "rationale": "",
}

result = app.invoke(initial_state)
print(result["decision"])
print(result["risk_score"])
print(result["rationale"])
```
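One minimal shape for the audit record described above, using only the standard library. The field names and the `MODEL_VERSION`/`GRAPH_VERSION` constants are assumptions; a real deployment would write this to Postgres or an event store rather than hold it in memory:

```python
import hashlib
import json
from datetime import datetime, timezone

MODEL_VERSION = "gpt-4o-mini-2024-07-18"  # hypothetical model pin
GRAPH_VERSION = "a1b2c3d"                 # hypothetical graph version hash

def build_audit_record(initial_state: dict, final_state: dict) -> dict:
    # Hash the canonicalized input so auditors can verify the exact payload
    # that produced a decision without storing PII twice.
    payload = json.dumps(initial_state, sort_keys=True).encode()
    return {
        "transaction_id": initial_state["transaction_id"],
        "input_hash": hashlib.sha256(payload).hexdigest(),
        "model_version": MODEL_VERSION,
        "graph_version": GRAPH_VERSION,
        "decision": final_state["decision"],
        "risk_score": final_state["risk_score"],
        "rationale": final_state["rationale"],
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

record = build_audit_record(
    {"transaction_id": "TXN-10001", "amount": 75000.0},
    {"decision": "escalate", "risk_score": 80, "rationale": "rule + LLM review"},
)
```

Sorting keys before hashing makes the hash stable across serializations, so the same input always reconstructs to the same fingerprint during a regulator review.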
Production Considerations
- Deployment
  - Keep the graph stateless at runtime and externalize persistence to Postgres or an event store.
  - Run workers in-region if data residency applies; pension data often cannot leave approved jurisdictions.
- Monitoring
  - Track false-positive rate, escalation rate, average review latency, and rule hit distribution.
  - Alert when model output drifts from historical baselines or when one rule dominates outcomes.
- Guardrails
  - Enforce JSON-schema-like structured outputs at every decision point.
  - Block direct auto-release on high-risk withdrawal patterns until a human approves them.
- Auditability
  - Log model name, prompt version, graph version hash, input payload hash, and final action.
  - Keep immutable records for regulator reviews and internal control testing.
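The structured-output guardrail above can be enforced with a small validator at each decision point. This stdlib-only sketch assumes a simple JSON payload shape; it rejects anything outside the allowed action vocabulary before it can reach case management:

```python
import json

ALLOWED_ACTIONS = {"approve", "hold", "escalate"}

def parse_decision(raw: str) -> dict:
    """Parse and validate a decision payload from the model.

    Raises ValueError instead of letting malformed output flow downstream.
    """
    data = json.loads(raw)
    if data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data.get('action')!r}")
    score = data.get("risk_score")
    if not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError(f"risk_score out of range: {score!r}")
    return data

ok = parse_decision('{"action": "hold", "risk_score": 60, "rationale": "cross-jurisdiction withdrawal"}')

try:
    parse_decision('{"action": "release_funds", "risk_score": 10}')
    rejected = False
except ValueError:
    rejected = True
```

Failing loudly here is the point: a rejected payload becomes a retry or a human-review case, never a silent auto-approval.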
Common Pitfalls
- Letting the LLM decide without deterministic pre-checks
  - Fix it by running policy rules first.
  - Pension workflows need explainable triggers before probabilistic reasoning enters the flow.
- Using unstructured free-text outputs
  - Fix it by forcing a small action vocabulary like `approve`, `hold`, `escalate`.
  - This makes downstream case management reliable and keeps audit logs usable.
- Ignoring residency and retention requirements
  - Fix it by pinning storage and inference regions to approved jurisdictions.
  - Also define retention windows per regulation; pension fund records often have longer retention than retail banking logs.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit