How to Build a Compliance Checking Agent Using LangChain in Python for Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, langchain, python, banking

A compliance checking agent for banking reviews customer messages, internal notes, transactions, or generated responses against policy and regulatory rules before anything is sent or approved. It matters because in banking, a bad answer is not just a UX issue — it can create regulatory exposure, audit failures, and customer harm.

Architecture

  • Input adapter

    • Pulls text from CRM notes, chat transcripts, email drafts, or transaction descriptions.
    • Normalizes the payload into a single schema the agent can inspect.
  • Policy retrieval layer

    • Stores bank policies, product rules, KYC/AML guidance, and regional compliance documents.
    • Uses Chroma or another vector store, queried through a retriever (the implementation below uses the LCEL equivalent of the classic RetrievalQA pattern).
  • Compliance reasoning chain

    • Uses LangChain’s ChatPromptTemplate, RunnablePassthrough, and an LLM to classify risk and explain why.
    • Produces structured outputs like approve, reject, or needs_review.
  • Deterministic rule engine

    • Handles hard rules that should never depend on model judgment.
    • Examples: prohibited phrases, missing disclosures, sanctioned-country mentions, PII leakage.
  • Audit logger

    • Persists input, retrieved policy snippets, model output, timestamp, and reviewer decision (a minimal record sketch follows this list).
    • Required for traceability in regulated environments.
  • Human escalation path

    • Routes ambiguous cases to a compliance analyst.
    • Keeps the agent advisory instead of making final decisions on high-risk items.
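
The audit logger is worth pinning down early. Here is a minimal sketch of the record it might persist; the field names are illustrative, not any regulatory standard:

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    request_text: str
    retrieved_snippets: list[str]          # policy passages the model actually saw
    model_output: dict                     # structured verdict from the reasoning chain
    prompt_version: str
    model_version: str
    decided_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    reviewer_decision: str | None = None   # filled in later if a human closes the case

Persist records append-only so auditors can replay any decision end to end.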

Implementation

1) Install dependencies and load policy documents

Use LangChain’s current split packages. Keep your policy text local if you have residency constraints.

pip install langchain langchain-openai langchain-community langchain-text-splitters chromadb pydantic

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("bank_policy.txt", encoding="utf-8")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=120)
chunks = splitter.split_documents(docs)
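
If your policies span many files, DirectoryLoader scales the same pattern. A sketch, assuming files live under ./policies/ and each filename starts with a region code (both assumptions; adjust to your layout):

from pathlib import Path
from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader(
    "./policies",
    glob="**/*.txt",
    loader_cls=TextLoader,
    loader_kwargs={"encoding": "utf-8"},
)
docs = loader.load()

# Tag each document with a region for later retrieval filtering,
# e.g. "eu_lending_policy.txt" -> "eu" (hypothetical naming scheme).
for doc in docs:
    doc.metadata["region"] = Path(doc.metadata["source"]).stem.split("_")[0]

chunks = splitter.split_documents(docs)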

2) Build a policy retriever

This agent should ground every decision in bank-approved text. A vector retriever gives you relevant policy passages for each check.

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_bank_policy"
)

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
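
Chroma also honors a metadata filter passed through search_kwargs, which is where region tags from step 1 pay off. A sketch, assuming your chunks carry a region field:

eu_retriever = vectorstore.as_retriever(
    search_kwargs={"k": 4, "filter": {"region": "eu"}}
)

Routing each request through a region-scoped retriever keeps, say, EU-only disclosure rules out of US decisions.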

3) Define the compliance chain with structured output

For banking workflows, free-text answers are hard to act on programmatically. Use a schema so downstream systems can route decisions reliably.

from typing import Literal
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

class ComplianceResult(BaseModel):
    decision: Literal["approve", "reject", "needs_review"] = Field(description="Routing verdict for the request")
    risk_level: Literal["low", "medium", "high"] = Field(description="Assessed compliance risk")
    rationale: str = Field(description="Short explanation grounded in the policy context")
    policy_refs: list[str] = Field(default_factory=list, description="Labels of the policy snippets relied on")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.with_structured_output(ComplianceResult)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a banking compliance checker. "
     "Use only the provided policy context. "
     "If the request conflicts with policy or lacks enough evidence, mark needs_review."),
    ("human",
     "Request:\n{request}\n\nPolicy context:\n{context}")
])

def format_docs(docs):
    return "\n\n".join(
        f"[Source {i+1}] {doc.page_content}" for i, doc in enumerate(docs)
    )

# The dict runs as a RunnableParallel: the raw request string passes through
# unchanged and, in parallel, is used as the retrieval query.
chain = (
    {
        "request": RunnablePassthrough(),
        "context": retriever | format_docs,
    }
    | prompt
    | structured_llm
)
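
A quick smoke test of the chain on its own, before the deterministic layer is added in step 4 (the request text is made up):

result = chain.invoke(
    "Customer asks us to waive the identity check because they are a long-time client."
)
print(result.decision, result.risk_level)
print(result.rationale)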

4) Add deterministic checks before the LLM verdict

Do not let the model decide obvious violations like PII leakage or sanctioned-country mentions. Put hard gates in front of the chain.

import re

# Illustrative patterns only; production systems pair keyword gates with
# dedicated PII detection and sanctions screening.
BLOCKLIST_PATTERNS = [
    r"\baccount number\b",
    r"\bssn\b",
    r"\bsanctioned country\b",
]

def hard_fail_checks(text: str):
    lowered = text.lower()
    for pattern in BLOCKLIST_PATTERNS:
        if re.search(pattern, lowered):
            return {
                "decision": "reject",
                "risk_level": "high",
                "rationale": f"Blocked by deterministic rule: {pattern}",
                "policy_refs": ["internal_hard_rules"]
            }
    return None

def check_compliance(request_text: str):
    blocked = hard_fail_checks(request_text)
    if blocked:
        return blocked
    result = chain.invoke(request_text)
    return result.model_dump()

sample_request = (
    "Draft a response telling the customer we can ignore KYC because they are VIP."
)

print(check_compliance(sample_request))

That pattern gives you two layers of control:

  • Hard rejection for non-negotiable violations.
  • Model-based review for nuanced cases where policy interpretation matters.
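
A sketch of how those two layers can feed downstream routing. The handlers are hypothetical stubs you would wire to your own queue or CRM:

def send_to_analyst_queue(result: dict) -> None:
    print("escalated:", result["rationale"])  # hypothetical stub

def release_message(result: dict) -> None:
    print("approved for sending")  # hypothetical stub

def route_decision(result: dict) -> None:
    if result["decision"] == "approve" and result["risk_level"] == "low":
        release_message(result)
    else:
        # reject, needs_review, or elevated risk: a human makes the final call
        send_to_analyst_queue(result)

route_decision(check_compliance(sample_request))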

Production Considerations

  • Keep data residency explicit

    • If policies or customer data must stay in-region, pin your vector store and LLM endpoints to approved jurisdictions.
    • Do not send raw customer records to external services unless legal review has cleared it.
  • Log every decision path

    • Store the original request, retrieved chunks, final decision, model version, prompt version, and timestamp.
    • In audits, you need to show why the agent rejected or escalated something.
  • Add confidence-based escalation

    • If retrieval returns weak matches or the model says needs_review, route to a human analyst.
    • Banking teams should approve edge cases manually instead of forcing an automated answer.
  • Monitor drift in policies

    • Compliance rules change often. Rebuild embeddings when policies update and track document versioning (one way to detect changes is sketched after this list).
    • Old policy chunks in your retriever are a silent failure mode.
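
One lightweight way to detect policy drift under the single-file setup from step 1. The version-file location is arbitrary, and rebuild_index is a hypothetical wrapper around the ingestion code from steps 1 and 2:

import hashlib
import json
from pathlib import Path

def policy_fingerprint(path: str) -> str:
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

VERSION_FILE = Path("policy_version.json")

current = policy_fingerprint("bank_policy.txt")
previous = json.loads(VERSION_FILE.read_text())["sha256"] if VERSION_FILE.exists() else None

if current != previous:
    rebuild_index()  # hypothetical: re-chunk, re-embed, replace the collection
    VERSION_FILE.write_text(json.dumps({"sha256": current}))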

Common Pitfalls

  1. Using the LLM as the only control

    • Bad idea for banking.
    • Fix it by enforcing deterministic checks for prohibited content and using the LLM only for interpretation and explanation.
  2. Skipping audit metadata

    • If you cannot reconstruct what the agent saw and why it decided something, you will fail internal review.
    • Fix it by persisting request text hashes, retrieved sources, prompt versions, and outputs.
  3. Letting retrieval pull irrelevant policy text

    • Weak retrieval leads to confident but wrong answers.
    • Fix it by tightening chunk size, improving metadata filters by product or region, and validating retrieval quality with test cases (a minimal test is sketched after this list).
  4. Ignoring jurisdiction-specific rules

    • A rule that is valid in one country may be wrong in another.
    • Fix it by tagging documents with region metadata and routing requests through region-aware retrievers before inference.
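
For pitfall 3, a minimal retrieval smoke test goes a long way. The query/keyword pairs are illustrative; build yours from real policy content:

RETRIEVAL_CASES = [
    ("Can we skip KYC for a VIP customer?", "kyc"),
    ("Wire transfer to a sanctioned country", "sanction"),
]

def test_retrieval_quality():
    for query, expected_keyword in RETRIEVAL_CASES:
        docs = retriever.invoke(query)
        joined = " ".join(doc.page_content.lower() for doc in docs)
        assert expected_keyword in joined, f"no relevant policy retrieved for: {query}"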

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
