# How to Build a Compliance-Checking Agent for Insurance Using LangChain in Python
A compliance checking agent for insurance reviews policy wording, claim notes, customer communications, and underwriting decisions against internal rules and regulatory requirements. It matters because insurance teams need fast, repeatable checks for things like unfair language, missing disclosures, data handling violations, and inconsistent claim handling before those issues become audit findings or customer complaints.
## Architecture
- **Document ingestion layer**
  - Pulls policy docs, claims emails, call transcripts, and underwriting notes from approved sources.
  - Normalizes text into chunks with metadata like `policy_id`, `jurisdiction`, `line_of_business`, and `source_system`.
- **Rules and guidance store**
  - Holds insurer-specific compliance policies, state/regional regulations, and internal SOPs.
  - Backed by a vector store for semantic retrieval plus a small set of hard-coded rules for deterministic checks.
- **Retriever**
  - Uses LangChain retrievers to fetch the most relevant compliance clauses for the document under review.
  - Must filter by jurisdiction and product line so the agent does not mix rules across regions.
- **LLM reasoning chain**
  - Compares the source document against retrieved guidance.
  - Produces structured findings: issue type, severity, evidence span, rule reference, and remediation.
- **Audit logging layer**
  - Stores prompts, retrieved passages, model outputs, timestamps, and human overrides.
  - Needed for model governance and regulator-ready traceability.
- **Human review queue**
  - Routes high-risk findings to a compliance analyst before any customer-facing action is taken.
  - Keeps the agent advisory, not autonomous.
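The layers above can be sketched as a thin review pipeline. This is a minimal skeleton for illustration, not a LangChain API: names like `review`, `Finding`, and `needs_human_review` are hypothetical placeholders.

```python
# Sketch of the review flow: deterministic checks first, then LLM analysis,
# then routing. All names here are illustrative placeholders.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    issue: str
    severity: str  # "low" | "medium" | "high"


@dataclass
class ReviewResult:
    findings: List[Finding] = field(default_factory=list)

    @property
    def needs_human_review(self) -> bool:
        # High-severity findings go to a compliance analyst;
        # the agent stays advisory, never autonomous.
        return any(f.severity == "high" for f in self.findings)


def review(document: str,
           hard_checks: Callable[[str], List[str]],
           llm_analysis: Callable[[str], List[Finding]]) -> ReviewResult:
    # Deterministic rule hits short-circuit the model call entirely.
    findings = [Finding(issue=hit, severity="high") for hit in hard_checks(document)]
    if not findings:
        findings = llm_analysis(document)
    return ReviewResult(findings=findings)
```

The important design choice is that deterministic checks run before any model call, so documents with hard violations never reach the LLM at all.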
## Implementation
### 1) Load compliance documents into a vector store
Use a simple retrieval setup first. For insurance use cases, keep each chunk tagged with jurisdiction and document type so you can filter later.
```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

loader = TextLoader("insurance_compliance_policy.txt", encoding="utf-8")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=120)
chunks = splitter.split_documents(docs)

# Tag every chunk so retrieval can filter by region and product line later.
for chunk in chunks:
    chunk.metadata["jurisdiction"] = "US"
    chunk.metadata["line_of_business"] = "life"
    chunk.metadata["doc_type"] = "internal_policy"

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
```
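The metadata tags only pay off if you actually filter on them. With FAISS you can pass a metadata filter through `search_kwargs` (e.g. `search_kwargs={"k": 4, "filter": {"jurisdiction": "US"}}`). The predicate itself is trivial; here is a standalone sketch of the scoping logic over plain metadata dicts, with illustrative sample data:

```python
# Jurisdiction-aware scoping: a chunk is in scope only when both its
# jurisdiction and product line match the document under review.
def matches_scope(metadata: dict, jurisdiction: str, line_of_business: str) -> bool:
    return (metadata.get("jurisdiction") == jurisdiction
            and metadata.get("line_of_business") == line_of_business)


# Hypothetical chunk metadata for illustration.
chunks = [
    {"text": "US life disclosure rule", "jurisdiction": "US", "line_of_business": "life"},
    {"text": "EU motor claims rule", "jurisdiction": "EU", "line_of_business": "motor"},
]
scoped = [c for c in chunks if matches_scope(c, "US", "life")]
```

Applying the filter at retrieval time, rather than asking the model to ignore out-of-scope rules, is what keeps regional regulations from bleeding into each other.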
### 2) Define a structured output schema
For compliance work, free-form text is not enough. Use a Pydantic model so downstream systems can route findings reliably.
```python
from pydantic import BaseModel, Field
from typing import List


class ComplianceFinding(BaseModel):
    issue: str = Field(description="Short description of the compliance issue")
    severity: str = Field(description="low, medium, high")
    evidence: str = Field(description="Exact excerpt from the source document")
    rule_reference: str = Field(description="Referenced policy or regulation clause")
    recommendation: str = Field(description="Action to fix the issue")


class ComplianceReport(BaseModel):
    findings: List[ComplianceFinding]
    overall_status: str = Field(description="pass, review_required, fail")
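One deterministic convention worth adopting: derive `overall_status` from the finding severities rather than letting the model choose it. The mapping below (any high-severity finding fails, anything else needs review, no findings pass) is a hypothetical policy, not part of the schema:

```python
def derive_overall_status(findings: list) -> str:
    # Hypothetical severity-to-status mapping: high -> fail,
    # any other finding -> review_required, no findings -> pass.
    severities = {f["severity"] for f in findings}
    if "high" in severities:
        return "fail"
    if severities:
        return "review_required"
    return "pass"
```

Computing the status yourself means a model that underrates its own findings cannot quietly downgrade a failing document to a pass.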
### 3) Build the LangChain retrieval + analysis chain
This pattern uses `create_retrieval_chain` with a prompt that forces grounded analysis. The model should only judge against the retrieved context plus the submitted document.
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are an insurance compliance reviewer. "
     "Find violations only when supported by the provided context. "
     "Return structured output matching the schema."),
    ("human",
     "Document under review:\n{input}\n\n"
     "Relevant compliance guidance:\n{context}\n\n"
     "Check for disclosure issues, prohibited language, privacy violations, "
     "and jurisdiction mismatches.")
])

combine_docs_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
rag_chain = create_retrieval_chain(retriever=retriever,
                                   combine_docs_chain=combine_docs_chain)

sample_document = """
Customer communication draft:
We may share your medical details with partners to improve service.
Your claim is likely approved based on current notes.
"""

result = rag_chain.invoke({"input": sample_document})
print(result["answer"])
```
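Note that `create_stuff_documents_chain` returns plain text in `result["answer"]`, so the schema from step 2 still needs parsing and validation before anything downstream trusts it. A minimal stdlib sketch, assuming the model emits a JSON object somewhere in its answer (`extract_report` is an illustrative helper, not a LangChain function):

```python
import json
import re


def extract_report(answer: str):
    # Pull the first {...} block out of the model's answer and parse it.
    # Returns None when no valid JSON object is found, so callers can
    # reject the response instead of guessing.
    match = re.search(r"\{.*\}", answer, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

In production you would feed the parsed dict into `ComplianceReport.model_validate` so malformed responses are rejected rather than routed.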
### 4) Add a deterministic guardrail before sending to the LLM
Insurance compliance has rules that do not need model interpretation. If a document contains restricted phrases or missing consent language, fail fast.
```python
BLOCKLIST = [
    "we may share your medical details",
    "guaranteed approval",
    "no questions asked"
]


def hard_fail_checks(text: str) -> list:
    # Return every blocklisted phrase found in the text (case-insensitive).
    hits = []
    lowered = text.lower()
    for phrase in BLOCKLIST:
        if phrase in lowered:
            hits.append(phrase)
    return hits


violations = hard_fail_checks(sample_document)
if violations:
    print({
        "overall_status": "fail",
        "findings": [
            {
                "issue": f"Blocked phrase detected: {v}",
                "severity": "high",
                "rule_reference": "Internal marketing and privacy policy",
                "recommendation": "Remove phrase before release"
            } for v in violations
        ]
    })
else:
    # No deterministic hits, so fall back to the LLM result from step 3.
    print(result["answer"])
```
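Exact substring matching is easy to evade with a line break or extra spaces. A slightly hardened variant, sketched here with stdlib `re`, compiles each phrase into a whitespace-tolerant, case-insensitive pattern (`hard_fail_checks_fuzzy` is an illustrative name):

```python
import re

BLOCKLIST = [
    "we may share your medical details",
    "guaranteed approval",
    "no questions asked",
]

# Build whitespace-tolerant patterns from the same blocklist, so
# "guaranteed   approval" or "guaranteed\napproval" still trips the check.
PATTERNS = [
    re.compile(r"\s+".join(map(re.escape, phrase.split())), re.IGNORECASE)
    for phrase in BLOCKLIST
]


def hard_fail_checks_fuzzy(text: str) -> list:
    # Return the canonical blocklist phrase for every pattern that matches.
    return [phrase for phrase, pattern in zip(BLOCKLIST, PATTERNS)
            if pattern.search(text)]
```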
## Production Considerations
- **Keep jurisdiction-aware retrieval**
  - Filter documents by state or country before retrieval.
  - A life insurance disclosure rule in one region may be invalid in another.
- **Log every decision path**
  - Store input text hashes, retrieved passages, model version, prompt template version, and final verdict.
  - This is what auditors will ask for when they want to reconstruct why a finding was made.
- **Control data residency**
  - Do not send regulated customer data to endpoints outside approved regions.
  - If you process PHI-like medical details or sensitive underwriting data, use approved cloud regions and encryption at rest/in transit.
- **Use human approval for high-severity findings**
  - Anything that could block a claim payout or trigger adverse action should go to a reviewer.
  - The agent should recommend; it should not make final coverage decisions.
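The logging guidance above can be made concrete with a small record builder. Field names here are illustrative; the key ideas are hashing the input rather than storing raw regulated text, and versioning both the model and the prompt template:

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(input_text: str, retrieved_ids: list,
                 model: str, prompt_version: str, verdict: str) -> str:
    # Hash the input so the log is reconstructable without retaining
    # raw customer text; record everything needed to replay the decision.
    record = {
        "input_sha256": hashlib.sha256(input_text.encode("utf-8")).hexdigest(),
        "retrieved_passages": retrieved_ids,
        "model_version": model,
        "prompt_template_version": prompt_version,
        "verdict": verdict,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, sort_keys=True)
```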
## Common Pitfalls
- **Mixing regulations across jurisdictions**
  - Mistake: retrieving all compliance docs globally and letting the model decide what applies.
  - Avoid it by tagging every chunk with jurisdiction metadata and filtering at retrieval time.
- **Letting the LLM invent rule references**
  - Mistake: asking for citations without grounding in actual policy text.
  - Avoid it by forcing retrieval-based answers only and rejecting outputs that lack evidence spans from source documents.
- **Treating unstructured output as production-ready**
  - Mistake: parsing plain English summaries with regex downstream.
  - Avoid it by using structured schemas like `ComplianceReport` and validating every response before routing it to case management systems.
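Rejecting ungrounded citations can be enforced mechanically: drop any finding whose evidence excerpt does not literally appear in the source document. A minimal sketch (`grounded_findings` is an illustrative helper):

```python
def grounded_findings(findings: list, source: str) -> list:
    # Keep only findings whose evidence excerpt actually appears in the
    # source document (case-insensitive). Findings with missing or
    # fabricated evidence are treated as hallucinated and discarded.
    lowered = source.lower()
    return [
        f for f in findings
        if f.get("evidence") and f["evidence"].lower() in lowered
    ]
```

A stricter version would also verify the quoted `rule_reference` against the retrieved passages, not just the evidence span.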
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.