How to Build a Compliance Checking Agent Using LangChain in Python for Banking
A compliance checking agent for banking reviews customer messages, internal notes, transactions, or generated responses against policy and regulatory rules before anything is sent or approved. It matters because in banking, a bad answer is not just a UX issue — it can create regulatory exposure, audit failures, and customer harm.
Architecture
Input adapter
- Pulls text from CRM notes, chat transcripts, email drafts, or transaction descriptions.
- Normalizes the payload into a single schema the agent can inspect (a sketch follows below).

Policy retrieval layer
- Stores bank policies, product rules, KYC/AML guidance, and regional compliance documents.
- Uses Chroma or another vector store with RetrievalQA-style retrieval patterns.

Compliance reasoning chain
- Uses LangChain's ChatPromptTemplate, RunnablePassthrough, and an LLM to classify risk and explain why.
- Produces structured outputs like approve, reject, or needs_review.

Deterministic rule engine
- Handles hard rules that should never depend on model judgment.
- Examples: prohibited phrases, missing disclosures, sanctioned-country mentions, PII leakage.

Audit logger
- Persists input, retrieved policy snippets, model output, timestamp, and reviewer decision.
- Required for traceability in regulated environments.

Human escalation path
- Routes ambiguous cases to a compliance analyst.
- Keeps the agent advisory instead of making final decisions on high-risk items.
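To make the input adapter concrete, here is a minimal sketch of the normalized schema. The field names (source_system, channel, text, region) are illustrative choices of mine, not a LangChain construct; adapt them to your own sources.

from datetime import datetime, timezone
from pydantic import BaseModel, Field

class ComplianceInput(BaseModel):
    source_system: str       # e.g. "crm", "chat", "email"
    channel: str             # where the text would be sent or stored
    text: str                # the content the agent will check
    region: str = "unknown"  # used later for region-aware retrieval
    received_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

# Example: wrapping a CRM note before it enters the pipeline.
payload = ComplianceInput(
    source_system="crm",
    channel="internal_note",
    text="Customer asked to skip ID verification.",
    region="eu",
)

Every adapter maps its source format into this one shape, so the rest of the pipeline never has to care where the text came from.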
Implementation
1) Install dependencies and load policy documents
Use LangChain’s current split packages. Keep your policy text local if you have residency constraints.
pip install langchain langchain-openai langchain-community chromadb pydantic
# Load the raw policy text and split it into retrieval-sized chunks.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("bank_policy.txt", encoding="utf-8")
docs = loader.load()

# Overlap keeps clause boundaries intact across chunk splits.
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=120)
chunks = splitter.split_documents(docs)
2) Build a policy retriever
This agent should ground every decision in bank-approved text. A vector retriever gives you relevant policy passages for each check.
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Persist locally so the index survives restarts and stays in-region.
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_bank_policy",
)

# Return the top 4 policy passages for each check.
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
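Before wiring the retriever into a chain, it is worth a quick smoke test. The query below is just an example; swap in a question your policy document actually answers.

# Sanity check: confirm the retriever surfaces relevant policy passages.
for doc in retriever.invoke("When is enhanced due diligence required?"):
    print(doc.page_content[:120])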
3) Define the compliance chain with structured output
For banking workflows, plain text answers are weak. Use a schema so downstream systems can route decisions reliably.
from typing import Literal
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
class ComplianceResult(BaseModel):
    decision: Literal["approve", "reject", "needs_review"] = Field(...)
    risk_level: Literal["low", "medium", "high"] = Field(...)
    rationale: str = Field(...)
    policy_refs: list[str] = Field(default_factory=list)
# temperature=0 keeps verdicts as repeatable as possible.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.with_structured_output(ComplianceResult)
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a banking compliance checker. "
     "Use only the provided policy context. "
     "If the request conflicts with policy or lacks enough evidence, mark needs_review."),
    ("human",
     "Request:\n{request}\n\nPolicy context:\n{context}"),
])
def format_docs(docs):
    return "\n\n".join(
        f"[Source {i+1}] {doc.page_content}" for i, doc in enumerate(docs)
    )
chain = (
    {
        "request": RunnablePassthrough(),
        "context": retriever | format_docs,
    }
    | prompt
    | structured_llm
)
4) Add deterministic checks before the LLM verdict
Do not let the model decide obvious violations like PII leakage or sanctioned-country mentions. Put hard gates in front of the chain.
import re
BLOCKLIST_PATTERNS = [
    r"\baccount number\b",
    r"\bssn\b",
    r"\bsanctioned country\b",
]

def hard_fail_checks(text: str):
    lowered = text.lower()
    for pattern in BLOCKLIST_PATTERNS:
        if re.search(pattern, lowered):
            return {
                "decision": "reject",
                "risk_level": "high",
                "rationale": f"Blocked by deterministic rule: {pattern}",
                "policy_refs": ["internal_hard_rules"],
            }
    return None
def check_compliance(request_text: str):
    # Deterministic gates run first; the LLM never sees hard violations.
    blocked = hard_fail_checks(request_text)
    if blocked:
        return blocked
    result = chain.invoke(request_text)
    return result.model_dump()

sample_request = (
    "Draft a response telling the customer we can ignore KYC because they are VIP."
)
print(check_compliance(sample_request))
That pattern gives you two layers of control:
- Hard rejection for non-negotiable violations.
- Model-based review for nuanced cases where policy interpretation matters.
Production Considerations
Keep data residency explicit
- If policies or customer data must stay in-region, pin your vector store and LLM endpoints to approved jurisdictions.
- Do not send raw customer records to external services unless legal review has cleared it.

Log every decision path
- Store the original request, retrieved chunks, final decision, model version, prompt version, and timestamp (a sketch follows below).
- In audits, you need to show why the agent rejected or escalated something.
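As a sketch of what that could look like, here is a minimal append-only JSONL logger. The field names, log path, and version strings are assumptions you would adapt to your own stack.

import hashlib
import json
from datetime import datetime, timezone

def log_decision(request_text: str, retrieved_docs, result: dict,
                 model_version: str = "gpt-4o-mini", prompt_version: str = "v1"):
    # Hash the request so the log is traceable without storing raw customer text.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_sha256": hashlib.sha256(request_text.encode()).hexdigest(),
        "retrieved": [d.page_content[:200] for d in retrieved_docs],
        "decision": result["decision"],
        "risk_level": result["risk_level"],
        "model_version": model_version,
        "prompt_version": prompt_version,
    }
    with open("audit_log.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")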
Add confidence-based escalation
- If retrieval returns weak matches or the model says needs_review, route to a human analyst (see the sketch below).
- Banking teams should approve edge cases manually instead of forcing an automated answer.
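Here is one way that routing might look, building on check_compliance from step 4. The distance threshold and the analyst-queue function are placeholders, not a prescribed API.

def route_decision(request_text: str):
    # Chroma returns distances where lower = closer; 0.6 is an illustrative cutoff.
    hits = vectorstore.similarity_search_with_score(request_text, k=4)
    weak_retrieval = not hits or hits[0][1] > 0.6

    result = check_compliance(request_text)
    if weak_retrieval or result["decision"] == "needs_review":
        send_to_analyst_queue(request_text, result)
    return result

def send_to_analyst_queue(request_text: str, result: dict):
    # Hypothetical stand-in for your ticketing or case-management integration.
    print(f"ESCALATED ({result['risk_level']}): {result['rationale']}")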
Monitor drift in policies
- Compliance rules change often. Rebuild embeddings when policies update and track document versioning (one approach is sketched below).
- Old policy chunks in your retriever are a silent failure mode.
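One lightweight approach: stamp every chunk with a policy version at index time and rebuild into a fresh collection when the version changes. The version string and directory naming below are assumptions.

POLICY_VERSION = "2025-06-01"  # bump whenever the policy document changes

for chunk in chunks:
    chunk.metadata["policy_version"] = POLICY_VERSION

# Rebuild into a version-specific directory so stale indexes are never reused.
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory=f"./chroma_bank_policy_{POLICY_VERSION}",
)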
Common Pitfalls
Using the LLM as the only control
- Bad idea for banking.
- Fix it by enforcing deterministic checks for prohibited content and using the LLM only for interpretation and explanation.

Skipping audit metadata
- If you cannot reconstruct what the agent saw and why it decided something, you will fail internal review.
- Fix it by persisting request text hashes, retrieved sources, prompt versions, and outputs.

Letting retrieval pull irrelevant policy text
- Weak retrieval leads to confident but wrong answers.
- Fix it by tightening chunk size, improving metadata filters by product or region, and validating retrieval quality with test cases (see the sketch below).
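A retrieval regression test can be as simple as the sketch below. The query/expected-phrase pairs are invented examples; replace them with real cases from your own policies.

# Each query should surface a chunk containing a known policy phrase.
RETRIEVAL_TESTS = [
    ("Can we waive KYC for a VIP customer?", "customer due diligence"),
    ("Wire transfer to a sanctioned region", "sanctions screening"),
]

for query, expected in RETRIEVAL_TESTS:
    hits = retriever.invoke(query)
    found = any(expected in d.page_content.lower() for d in hits)
    print(f"{'PASS' if found else 'FAIL'}: {query!r} -> {expected!r}")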
Ignoring jurisdiction-specific rules
- A rule that is valid in one country may be wrong in another.
- Fix it by tagging documents with region metadata and routing requests through region-aware retrievers before inference (sketched below).
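With Chroma, region-aware routing can be a metadata filter, assuming each chunk was tagged with a region key at index time:

# Assumption: chunks carry metadata like {"region": "eu"} from indexing.
def region_retriever(region: str):
    return vectorstore.as_retriever(
        search_kwargs={"k": 4, "filter": {"region": region}}
    )

eu_retriever = region_retriever("eu")  # searches only EU-tagged policy chunks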
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.