How to Build an Underwriting Agent Using LangChain in Python for Healthcare
An underwriting agent for healthcare reviews member or provider information, applies policy rules, and produces a decision recommendation with reasons and evidence. In practice, that means faster eligibility checks, fewer manual reviews, and a cleaner audit trail for why a claim, plan, or risk case was approved, flagged, or rejected.
Architecture
- **Input normalization layer**
  - Converts raw intake data from EHR exports, PDFs, CSVs, or API payloads into a consistent schema.
  - Keeps PHI fields explicit so you can redact or route them correctly.
- **Document retrieval layer**
  - Pulls policy docs, underwriting guidelines, CMS rules, and plan-specific exclusions.
  - Usually backed by a vector store plus keyword retrieval for exact policy language.
- **Reasoning and decision layer**
  - Uses an LLM chain to compare case facts against policy rules.
  - Outputs a structured recommendation: approve, deny, manual review, or request more info.
- **Compliance guardrail layer**
  - Enforces HIPAA-safe handling, audit logging, and "no decision without citation" behavior.
  - Blocks unsupported recommendations when evidence is missing.
- **Human review handoff**
  - Escalates edge cases to an underwriter with a summary of facts and cited policy references.
  - Critical for adverse actions and ambiguous medical necessity cases.
- **Audit and observability layer**
  - Stores prompts, retrieved documents, model outputs, timestamps, and final decisions.
  - Needed for traceability, model governance, and regulatory review.
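To make the input normalization layer concrete, here is a minimal sketch. The field names (`ssn`, `full_name`, and so on) and the redaction strategy are illustrative assumptions, not a fixed schema; in production you would drive this from your intake contract.

```python
def normalize_intake(raw: dict) -> dict:
    """Strip PHI the agent does not need, but record what was removed."""
    phi_fields = {"ssn", "full_name", "address", "phone"}
    normalized = {k: v for k, v in raw.items() if k not in phi_fields}
    # Keep the redaction explicit so downstream layers can audit or
    # route the withheld fields instead of silently losing them.
    normalized["redacted_phi_fields"] = sorted(phi_fields & raw.keys())
    return normalized

record = {"member_id": "M12345", "ssn": "000-00-0000", "age": 52}
print(normalize_intake(record))
```

The point of the explicit `redacted_phi_fields` entry is that redaction stays visible in the audit trail rather than happening implicitly.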
Implementation
1) Define the case schema and load policy documents
Use pydantic to keep your input structured. Then load underwriting guidance into LangChain documents so the agent can cite policy text instead of guessing.
```python
from pydantic import BaseModel
from langchain_community.document_loaders import TextLoader

class UnderwritingCase(BaseModel):
    member_id: str
    age: int
    diagnosis_codes: list[str]
    medications: list[str]
    requested_service: str
    state: str

loader = TextLoader("data/healthcare_underwriting_policy.txt", encoding="utf-8")
policy_docs = loader.load()

case = UnderwritingCase(
    member_id="M12345",
    age=52,
    diagnosis_codes=["E11.9", "I10"],
    medications=["metformin", "lisinopril"],
    requested_service="prior auth for advanced imaging",
    state="TX",
)
```
2) Build retrieval over policy text
For healthcare underwriting you want exact retrieval on plan rules plus semantic search for related clauses. FAISS is fine for the first pass; swap in your enterprise vector store later if you need residency controls.
```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(policy_docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Join the diagnosis codes so the query embeds clean text, not a list repr.
relevant_docs = retriever.invoke(
    f"{case.requested_service} {case.state} {' '.join(case.diagnosis_codes)}"
)
```
3) Create a structured underwriting chain
Use ChatOpenAI, ChatPromptTemplate, StrOutputParser, and RunnablePassthrough. The pattern below forces the model to return a concise decision with citations.
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a healthcare underwriting assistant. "
     "Use only the provided policy context. "
     "If evidence is insufficient, return MANUAL_REVIEW. "
     "Do not expose PHI beyond what is necessary."),
    ("human",
     "Case:\n{case}\n\nPolicy context:\n{context}\n\n"
     "Return:\n"
     "- decision: APPROVE | DENY | MANUAL_REVIEW\n"
     "- reason\n"
     "- cited_policy_lines"),
])

def format_docs(docs):
    return "\n\n".join(
        f"[Source {i+1}] {doc.page_content}" for i, doc in enumerate(docs)
    )

chain = (
    {
        "case": RunnablePassthrough(),
        "context": lambda x: format_docs(retriever.invoke(str(x))),
    }
    | prompt
    | llm
    | StrOutputParser()
)

result = chain.invoke(case.model_dump())
print(result)
```
This is the core pattern: retrieve policy context first, then ask the model to decide strictly from that context. For healthcare workflows this matters because you need defensible decisions tied to written policy.
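Because the chain above ends in `StrOutputParser`, the decision arrives as free text. A small parser (an assumption for illustration, not part of the chain itself) can turn the bulleted reply into a dict so downstream code can branch on it, failing closed on anything unexpected:

```python
def parse_decision(text: str) -> dict:
    """Parse the model's bulleted reply into structured fields."""
    out = {"decision": "MANUAL_REVIEW", "reason": "", "cited_policy_lines": ""}
    for line in text.splitlines():
        line = line.lstrip("- ").strip()
        for key in out:
            if line.lower().startswith(f"{key}:"):
                out[key] = line.split(":", 1)[1].strip()
    # Fail closed: any label outside the allowed set escalates.
    if out["decision"] not in {"APPROVE", "DENY", "MANUAL_REVIEW"}:
        out["decision"] = "MANUAL_REVIEW"
    return out

sample = "- decision: APPROVE\n- reason: Meets imaging criteria\n- cited_policy_lines: [Source 2]"
print(parse_decision(sample))
```

In a hardened build you would likely replace this with LangChain's structured-output support, but the fail-closed default is the part worth keeping either way.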
4) Add a human-review gate for risky cases
Do not auto-finalize anything that touches adverse action thresholds or incomplete clinical evidence. Use simple rule checks before calling the LLM output final.
```python
def needs_manual_review(case_data: dict) -> bool:
    high_risk_states = {"CA", "NY"}
    if case_data["state"] in high_risk_states:
        return True
    if len(case_data["diagnosis_codes"]) == 0:
        return True
    if case_data["age"] > 65:
        return True
    return False

case_data = case.model_dump()

if needs_manual_review(case_data):
    final_decision = {
        "decision": "MANUAL_REVIEW",
        "reason": "Case meets escalation criteria before automated determination.",
    }
else:
    final_decision = {"decision": result}

print(final_decision)
```
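Whatever the outcome, every run should leave an audit record. A minimal sketch, assuming a JSON-lines file as the sink (the field names here are illustrative; a real deployment would also capture model version and the user identity):

```python
import json
import time
import uuid

def log_decision(path: str, case_data: dict, sources: list[str], output: str) -> None:
    """Append one audit record per decision as a JSON line."""
    record = {
        "run_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "case": case_data,
        "retrieved_sources": sources,
        "model_output": output,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Append-only JSON lines keeps the log trivially greppable and avoids partial-write corruption of earlier records.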
Production Considerations
- **HIPAA controls**
  - Encrypt data in transit and at rest.
  - Minimize PHI in prompts; send only what the model needs to decide.
  - Keep access logs tied to user identity and case ID.
- **Data residency**
  - If you operate across regions, pin embeddings storage and model inference to approved jurisdictions.
  - Don’t let policy docs or member data drift into unmanaged SaaS buckets.
- **Auditability**
  - Persist prompt inputs, retrieved documents, model version, timestamps, and final output.
  - Underwriters need to answer “why was this decided?” months later.
- **Guardrails**
  - Require citations from retrieved policy text before any approve/deny recommendation.
  - Route low-confidence or incomplete cases to manual review instead of forcing an answer.
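The citation guardrail can be enforced mechanically rather than trusted to the prompt. A minimal sketch, assuming the decision has already been parsed into a dict with a `cited_policy_lines` field (an assumed shape, not a LangChain API):

```python
def enforce_citation(result: dict) -> dict:
    """Downgrade any approve/deny that arrives without a policy citation."""
    if result.get("decision") in {"APPROVE", "DENY"} and not result.get("cited_policy_lines"):
        # Fail closed: escalate instead of forwarding an unsupported decision.
        return {
            "decision": "MANUAL_REVIEW",
            "reason": "Recommendation lacked a supporting policy citation.",
        }
    return result

print(enforce_citation({"decision": "DENY", "cited_policy_lines": ""}))
```

Running this check outside the LLM means a prompt regression can never silently ship an uncited adverse action.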
Common Pitfalls
- **Letting the LLM make unsupported decisions**
  - Fix it by requiring retrieved policy context in every run.
  - If the context is thin or missing, force MANUAL_REVIEW.
- **Stuffing raw PHI into prompts**
  - Fix it by normalizing inputs and redacting unnecessary identifiers.
  - Only include clinical details relevant to the underwriting rule being evaluated.
- **Skipping versioning on policies**
  - Fix it by storing document versions with effective dates.
  - A decision made under last quarter’s guideline should not silently use this quarter’s rule set.
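One way to handle the versioning pitfall is to stamp each policy document with a version and effective date, then resolve the version in force as of the case date. The metadata shape below is an assumption for illustration; with LangChain you would typically carry these fields in each `Document`'s `metadata` and filter at retrieval time.

```python
from datetime import date

# Hypothetical version metadata; field names are assumptions.
policy_versions = [
    {"version": "2024-Q1", "effective_date": "2024-01-01"},
    {"version": "2024-Q2", "effective_date": "2024-04-01"},
]

def version_in_force(versions: list[dict], as_of: date) -> dict:
    """Pick the latest version whose effective date is on or before the case date."""
    eligible = [
        v for v in versions
        if date.fromisoformat(v["effective_date"]) <= as_of
    ]
    return max(eligible, key=lambda v: v["effective_date"])
```

Resolving the version per case, rather than always retrieving the newest documents, is what makes a months-old decision reproducible under the rules that actually applied.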
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit