How to Build an Underwriting Agent for Insurance Using LangChain in Python

By Cyprian Aarons · Updated 2026-04-21
Tags: underwriting, langchain, python, insurance

An underwriting agent automates the first pass of insurance risk evaluation. It ingests applicant data, checks policy rules, flags missing evidence, and produces a structured recommendation that an underwriter can review and approve.

For insurance teams, this matters because underwriting is high-volume, rules-heavy, and audit-sensitive. A good agent reduces turnaround time without turning the decision into a black box.

Architecture

  • Input normalizer
    • Converts raw application payloads, PDFs, emails, and CRM fields into a clean structured schema.
  • Policy/rule retrieval layer
    • Pulls underwriting guidelines, appetite rules, exclusions, and referral thresholds from a controlled knowledge base.
  • LLM reasoning chain
    • Uses LangChain to summarize risk signals and map them to a recommendation: approve, refer, or decline.
  • Deterministic decision engine
    • Applies hard business rules outside the model for non-negotiables like age limits, geography restrictions, or mandatory disclosures.
  • Audit logger
    • Stores inputs, retrieved policy snippets, model output, and final decision for compliance review.
  • Human-in-the-loop handoff
    • Routes borderline cases to an underwriter with the evidence needed to make a final call.

Implementation

1. Define the underwriting schema

Start by forcing structure early. Insurance workflows fail when free-form text gets passed around without a schema.

from typing import List, Literal
from pydantic import BaseModel, Field

class UnderwritingApplication(BaseModel):
    applicant_name: str
    product_line: Literal["life", "health", "auto", "property", "commercial"]
    age: int
    state: str
    requested_coverage: float = Field(gt=0)
    prior_claims: int = Field(ge=0)
    notes: str

class UnderwritingDecision(BaseModel):
    recommendation: Literal["approve", "refer", "decline"]
    risk_level: Literal["low", "medium", "high"]
    rationale: List[str]
    missing_information: List[str]

This gives you typed inputs and typed outputs. In production, that structure is what makes downstream auditing and validation possible.
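Typed inputs also mean malformed applications fail fast, before any model call. A quick sketch of that behavior (the model is repeated here so the snippet runs standalone; field names match the schema above):

```python
from typing import Literal
from pydantic import BaseModel, Field, ValidationError

class UnderwritingApplication(BaseModel):
    applicant_name: str
    product_line: Literal["life", "health", "auto", "property", "commercial"]
    age: int
    state: str
    requested_coverage: float = Field(gt=0)
    prior_claims: int = Field(ge=0)
    notes: str

try:
    UnderwritingApplication(
        applicant_name="Jane Doe",
        product_line="marine",    # not an allowed Literal value
        age=54,
        state="NY",
        requested_coverage=-100,  # violates gt=0
        prior_claims=1,
        notes="",
    )
except ValidationError as e:
    # Both bad fields are reported in one pass.
    print(sorted(err["loc"][0] for err in e.errors()))
```

Rejecting a bad payload at the boundary is far cheaper than discovering it after the LLM has already produced a recommendation from garbage input.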

2. Load underwriting rules into retrieval

Use PyPDFLoader or another document loader for your guideline documents, then index them with FAISS. For this example, I’ll keep it simple and use in-memory documents.

from langchain_core.documents import Document
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

docs = [
    Document(page_content="Decline if applicant age is over 70 for term life products."),
    Document(page_content="Refer if prior claims are greater than 3 in the last 24 months."),
    Document(page_content="Require manual review for commercial property applications in high-risk flood zones."),
]

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

This retrieval layer gives the model policy context instead of relying on memory. That is important when you need traceability for regulators and internal audit.
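FAISS with OpenAI embeddings needs network access and an API key, which makes unit tests awkward. One option is a deterministic stand-in with the same shape: the keyword-overlap retriever below is a sketch (the scoring is purely illustrative, not what FAISS does), useful for testing the pipeline without embeddings.

```python
def keyword_retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Score each guideline by word overlap with the query and return the
    top k. A deterministic stand-in for embedding search in unit tests."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

rules = [
    "Decline if applicant age is over 70 for term life products.",
    "Refer if prior claims are greater than 3 in the last 24 months.",
    "Require manual review for commercial property applications in high-risk flood zones.",
]
top = keyword_retrieve("term life applicant age 72", rules, k=1)
print(top[0])  # the age rule scores highest for this query
```

Keeping the retriever behind a small function boundary like this also makes it easy to swap FAISS back in for production without touching the rest of the chain.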

3. Build the LangChain underwriting chain

Use ChatOpenAI, ChatPromptTemplate, and PydanticOutputParser to produce a structured underwriting recommendation.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.runnables import RunnablePassthrough

parser = PydanticOutputParser(pydantic_object=UnderwritingDecision)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are an insurance underwriting assistant. "
     "Use only the provided policy context and application data. "
     "Do not invent facts. If information is missing, list it explicitly."),
    ("human",
     "Application:\n{application}\n\n"
     "Policy context:\n{policy_context}\n\n"
     "{format_instructions}")
]).partial(format_instructions=parser.get_format_instructions())

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def format_policy_context(query: str) -> str:
    # Retrieve the top-k guideline snippets for this case and join them
    # into a single block for the prompt.
    return "\n".join(doc.page_content for doc in retriever.invoke(query))

chain = (
    {
        # Serialize the application so the prompt and the retriever see
        # the same JSON representation of the case.
        "application": lambda app: app.model_dump_json(indent=2),
        "policy_context": lambda app: format_policy_context(app.model_dump_json()),
    }
    | prompt
    | llm
    | parser
)

The key pattern here is that the retriever feeds policy context into the prompt before generation. Setting temperature=0 makes decisions far more repeatable across runs, though it does not guarantee bit-identical outputs, which is one more reason the hard rules in the next step live in code.
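One failure mode worth handling up front: the output parser raises when the model's reply doesn't match the schema. A conservative fallback keeps the pipeline from crashing mid-queue. This is a sketch; `safe_invoke` is an illustrative helper, not a LangChain API.

```python
def safe_invoke(run, app, retries: int = 1):
    """Call the chain; if the output parser keeps raising, fall back to a
    conservative 'refer' so a human sees the case instead of an exception."""
    for _ in range(retries + 1):
        try:
            return run(app)
        except Exception:
            continue
    return {
        "recommendation": "refer",
        "risk_level": "high",
        "rationale": ["Model output failed schema validation."],
        "missing_information": [],
    }

# Simulate a chain whose output parser always raises:
def broken_chain(app):
    raise ValueError("output did not match UnderwritingDecision schema")

print(safe_invoke(broken_chain, {})["recommendation"])  # refer
```

LangChain also ships retry and output-fixing parsers you could layer in instead; the plain wrapper above just keeps the fallback logic visible and easy to unit-test.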

4. Run deterministic checks before final output

Do not let the LLM override hard underwriting rules. Use code for rules that must never be probabilistic.

def hard_rules(app: UnderwritingApplication) -> UnderwritingDecision | None:
    if app.product_line == "life" and app.age > 70:
        return UnderwritingDecision(
            recommendation="decline",
            risk_level="high",
            rationale=["Age exceeds product eligibility limit."],
            missing_information=[]
        )
    if app.prior_claims > 3:
        return UnderwritingDecision(
            recommendation="refer",
            risk_level="high",
            rationale=["Prior claims exceed referral threshold."],
            missing_information=[]
        )
    return None

app = UnderwritingApplication(
    applicant_name="Jane Doe",
    product_line="life",
    age=54,
    state="NY",
    requested_coverage=250000,
    prior_claims=1,
    notes="Non-smoker. No adverse medical disclosures."
)

decision = hard_rules(app) or chain.invoke(app)
print(decision.model_dump())

This split is how you keep compliance teams happy. The model handles synthesis; code handles policy enforcement.

Production Considerations

  • Log every decision artifact
    • Store application payloads, retrieved policy docs, prompt version, model version, and final output in an immutable audit trail.
  • Keep data residency explicit
    • If your policies require regional storage, pin embeddings, vector stores, and inference endpoints to approved regions only.
  • Add guardrails for regulated language
    • Block unsupported claims like medical diagnosis or legal advice. The agent should recommend review, not impersonate an underwriter beyond its authority.
  • Monitor drift by product line
    • Track approval, referral, and override rates by line of business and geography so you catch bad prompt changes or stale guideline content early.
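The audit-trail bullet above can be made concrete with very little code. The sketch below builds an append-only record and hashes its contents so reviewers can detect after-the-fact edits; the field names are assumptions you'd adapt to your own compliance schema.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(app: dict, policy_snippets: list[str], decision: dict,
                 prompt_version: str, model_version: str) -> dict:
    """Build one immutable audit entry. The content hash lets a reviewer
    verify that no stored field was altered after the decision was logged."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "application": app,
        "policy_snippets": policy_snippets,
        "decision": decision,
        "prompt_version": prompt_version,
        "model_version": model_version,
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["content_hash"] = hashlib.sha256(payload).hexdigest()
    return body

rec = audit_record(
    {"name": "Jane Doe"}, ["age rule"], {"recommendation": "approve"},
    prompt_version="uw-prompt-v3", model_version="gpt-4o-mini",
)
print(sorted(rec))  # every field a compliance review needs, plus the hash
```

Writing these records to append-only storage (rather than a mutable table) is what turns "we log decisions" into an audit trail a regulator will accept.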

Common Pitfalls

  • Letting the LLM make final eligibility decisions
    • Avoid this by putting non-negotiable rules in Python before the chain runs.
  • Skipping source citations in outputs
    • Every recommendation should include which guideline snippets influenced it. Without that, auditability collapses during review.
  • Using one generic prompt across all products
    • Life, auto, property, and commercial underwriting have different thresholds. Split prompts or chains per product line so the logic stays aligned with actual policy.
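On the citations pitfall specifically: the most reliable fix is to attach the retrieved snippets to the decision in code, rather than trusting the model to quote them. A minimal sketch (the `cited_guidelines` field name is an assumption, not part of the schema above):

```python
def attach_citations(decision: dict, retrieved: list[str]) -> dict:
    """Attach the exact guideline snippets that were in the prompt context,
    so every stored decision is traceable without relying on the model."""
    return {**decision, "cited_guidelines": list(retrieved)}

decision = {
    "recommendation": "refer",
    "risk_level": "high",
    "rationale": ["Prior claims exceed referral threshold."],
    "missing_information": [],
}
out = attach_citations(
    decision,
    ["Refer if prior claims are greater than 3 in the last 24 months."],
)
print(out["cited_guidelines"][0])
```

Because the citations come from the same retrieval call that built the prompt, they are correct by construction, which is exactly what an auditor wants.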

A production underwriting agent is not just a chat wrapper around policy documents. It is a controlled workflow with retrieval, deterministic rules, structured output, and auditability built in from day one.


By Cyprian Aarons, AI Consultant at Topiax.