How to Build a Transaction Monitoring Agent Using LangChain in Python for Fintech
A transaction monitoring agent watches payment and transfer activity, scores each event for risk, explains why it flagged something, and routes suspicious cases to the right workflow. For fintech, this matters because you need fast detection, consistent decisions, and an audit trail that can survive compliance review.
Architecture
- **Transaction ingestion layer**
  - Pulls events from Kafka, SQS, a webhook, or a database stream.
  - Normalizes fields like amount, currency, merchant category, country, account age, and velocity metrics.
- **Risk feature builder**
  - Computes deterministic signals before the LLM sees anything.
  - Examples: daily spend delta, geo mismatch, first-time beneficiary, rapid retries.
- **LangChain decision agent**
  - Uses a `ChatOpenAI` model wrapped in a `ChatPromptTemplate`.
  - Produces structured outputs like risk score, reason codes, and next action.
- **Policy and compliance guardrails**
  - Enforces hard rules outside the model.
  - Example: block sanctioned countries immediately; never let the model override AML policy.
- **Case management sink**
  - Writes results to your alert queue, case system, or SIEM.
  - Stores input features, model output, prompt version, and timestamp for auditability.
- **Human review path**
  - Escalates medium/high-risk cases to an analyst.
  - Keeps low-confidence decisions out of automatic enforcement.
Implementation
1) Install the right packages
Use LangChain’s current split packages. For a production service you want explicit dependencies rather than a single monolith import path.
```shell
pip install langchain-core langchain-openai pydantic
```
Set your API key in the environment:
```shell
export OPENAI_API_KEY="your-key"
```
2) Define the transaction schema and risk rules
Keep deterministic checks outside the LLM. The model should explain and classify; your code should enforce policy.
```python
from pydantic import BaseModel, Field


class Transaction(BaseModel):
    transaction_id: str
    account_id: str
    amount: float
    currency: str
    country: str
    merchant_category: str
    account_age_days: int
    velocity_1h: int = Field(description="Number of transactions in the last hour")
    is_new_beneficiary: bool


# Hard policy rules live in code; the model never sees or overrides them.
SANCTIONED_COUNTRIES = {"IR", "KP", "SY"}


def hard_block(tx: Transaction) -> bool:
    return tx.country in SANCTIONED_COUNTRIES or tx.amount > 1_000_000
```
3) Build a structured LangChain agent for classification
Use `ChatPromptTemplate`, `ChatOpenAI`, and `PydanticOutputParser` so you get machine-readable output. This is better than free-form text because downstream systems need stable fields.
```python
from typing import Literal

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import PydanticOutputParser


class MonitoringDecision(BaseModel):
    risk_level: Literal["low", "medium", "high"]
    reason_codes: list[str] = Field(default_factory=list)
    recommended_action: Literal["allow", "review", "block"]
    summary: str


parser = PydanticOutputParser(pydantic_object=MonitoringDecision)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a transaction monitoring analyst for a fintech company. "
     "Apply AML/fraud reasoning. Never override hard policy rules. "
     "Return only valid structured output."),
    ("human",
     "Transaction:\n{transaction_json}\n\n"
     "Known risk signals:\n{signals}\n\n"
     "{format_instructions}")
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
agent_chain = prompt | llm | parser


def build_signals(tx: Transaction) -> list[str]:
    """Deterministic risk signals computed before any model call."""
    signals = []
    if tx.velocity_1h >= 10:
        signals.append("high_velocity")
    if tx.is_new_beneficiary:
        signals.append("new_beneficiary")
    if tx.account_age_days < 30:
        signals.append("new_account")
    if tx.amount > 5000:
        signals.append("high_amount")
    return signals


def monitor_transaction(tx_dict: dict) -> dict:
    tx = Transaction(**tx_dict)

    # Deterministic policy check runs before the LLM is invoked.
    if hard_block(tx):
        return {
            "transaction_id": tx.transaction_id,
            "risk_level": "high",
            "recommended_action": "block",
            "reason_codes": ["policy_block"],
            "summary": "Blocked by deterministic policy rule.",
        }

    result = agent_chain.invoke({
        "transaction_json": tx.model_dump_json(),
        "signals": build_signals(tx),
        "format_instructions": parser.get_format_instructions(),
    })
    return {
        "transaction_id": tx.transaction_id,
        **result.model_dump(),
    }
```
4) Run it on a sample transaction
This pattern fits a worker consuming events from your queue. The important part is that the LLM only handles interpretation after policy checks have already run.
```python
sample_tx = {
    "transaction_id": "tx_12345",
    "account_id": "acct_999",
    "amount": 7800.50,
    "currency": "USD",
    "country": "GB",
    "merchant_category": "digital_goods",
    "account_age_days": 12,
    "velocity_1h": 14,
    "is_new_beneficiary": True,
}

decision = monitor_transaction(sample_tx)
print(decision)
```
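The queue-worker shape this pattern implies can be sketched with the standard-library `queue` module standing in for Kafka or SQS. The `monitor` callable and the dead-letter list are assumptions: in production you would pass `monitor_transaction` and park failures on a real replay topic.

```python
import json
import queue


def run_worker(events: "queue.Queue[str]", monitor, dead_letter: list) -> list:
    """Drain the queue, scoring each event; failed payloads go to dead-letter."""
    decisions = []
    while True:
        try:
            raw = events.get_nowait()
        except queue.Empty:
            break
        try:
            decisions.append(monitor(json.loads(raw)))
        except Exception:
            # Never drop a payment event silently; park it for replay.
            dead_letter.append(raw)
    return decisions
```

The dead-letter list matters in fintech: a malformed event is itself a signal, and regulators will ask what happened to it.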
Production Considerations
- **Keep compliance logic deterministic**
  - Sanctions screening, threshold blocks, and residency checks should live in code or dedicated rules engines.
  - The LLM should never be the final authority for AML or fraud controls.
- **Log everything needed for audit**
  - Persist input payload hashes, extracted features, prompt version, model name, output JSON, and human reviewer actions.
  - Regulators will ask why a case was flagged; you need reproducible evidence.
- **Respect data residency**
  - If customer data must stay in-region, route prompts through region-bound infrastructure or use an approved private deployment.
  - Avoid sending raw PII unless you have an explicit legal basis and masking controls.
- **Add confidence-based routing**
  - Low-risk cases can auto-close.
  - Medium-risk cases go to analysts.
  - High-risk or policy-triggered cases should block immediately and create an alert.
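A minimal sketch of the audit record described above, assuming a SHA-256 hash of the canonicalized payload and a `PROMPT_VERSION` string you maintain alongside the template:

```python
import hashlib
import json
from datetime import datetime, timezone

# Bump this whenever the prompt template or chain wiring changes.
PROMPT_VERSION = "monitoring-v1"


def build_audit_record(tx_payload: dict, signals: list, decision: dict,
                       model_name: str) -> dict:
    """Everything a reviewer needs to reproduce why a case was flagged."""
    # sort_keys makes the hash stable regardless of dict ordering.
    canonical = json.dumps(tx_payload, sort_keys=True)
    return {
        "payload_sha256": hashlib.sha256(canonical.encode()).hexdigest(),
        "signals": signals,
        "decision": decision,
        "prompt_version": PROMPT_VERSION,
        "model": model_name,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
```

Hashing the payload (rather than storing it raw) lets you prove which input produced a decision without keeping a second copy of customer data in the audit store.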
Common Pitfalls
- **Letting the model make policy decisions**
  - Mistake: asking the LLM whether to ignore sanctions or threshold rules.
  - Fix: enforce hard blocks before any model call.
- **Using free-form outputs**
  - Mistake: parsing plain-English summaries with regex.
  - Fix: use `PydanticOutputParser` or another structured-output approach so downstream systems get stable fields.
- **Ignoring prompt/version traceability**
  - Mistake: changing prompts without tracking which version produced which alert.
  - Fix: store prompt templates, model name, temperature, and chain version with every decision record.
- **Sending raw sensitive data everywhere**
  - Mistake: dumping full customer profiles into prompts.
  - Fix: minimize payloads. Mask account numbers, remove unnecessary PII, and keep only the features needed for the monitoring decision.
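A minimal payload-minimization sketch for that last fix. The field names here (`SENSITIVE_FIELDS`, `account_id`) are assumptions; adapt them to your own schema.

```python
# Fields that never need to reach the model at all.
SENSITIVE_FIELDS = {"name", "email", "address", "phone"}


def minimize_payload(tx: dict) -> dict:
    """Drop PII fields outright and mask account identifiers to the last 4 chars."""
    slim = {k: v for k, v in tx.items() if k not in SENSITIVE_FIELDS}
    if "account_id" in slim:
        acct = str(slim["account_id"])
        slim["account_id"] = "*" * max(len(acct) - 4, 0) + acct[-4:]
    return slim
```

Run this before building the prompt, so only monitoring-relevant features ever leave your boundary.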
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.