How to Build an Underwriting Agent for Lending Using LangChain in Python
An underwriting agent for lending takes borrower data, pulls in policy rules, evaluates risk signals, and returns a decision recommendation with an explanation trail. It matters because lending decisions need to be consistent, auditable, and fast enough to support application throughput without turning every case into a manual review.
Architecture
- **Input normalizer**
  - Converts raw application payloads into a stable schema.
  - Validates required fields like income, DTI, credit score, loan amount, and jurisdiction.
- **Policy/rules layer**
  - Encodes hard constraints such as minimum credit score, max debt-to-income ratio, and prohibited geographies.
  - Keeps deterministic rules outside the model so compliance can sign off on them.
- **LLM reasoning layer**
  - Uses LangChain to summarize risk factors and generate a recommendation.
  - Should never be the only source of truth for approval or denial.
- **Retrieval layer**
  - Pulls product policy, underwriting guidelines, and regulatory notes from approved documents.
  - Keeps the agent grounded in current lending policy.
- **Decision engine**
  - Combines rules + retrieved policy + model output into approve / refer / decline.
  - Produces structured output for downstream systems.
- **Audit logger**
  - Stores inputs, retrieved sources, model outputs, and final decision.
  - Required for adverse action support and internal review.
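The decision-engine component can be sketched in plain Python before any LLM is involved. This is an illustrative combiner, not part of LangChain: the names `HARD_DECLINE_FLAGS` and `finalize_decision` are assumptions for this sketch, and the key property is that deterministic rules always outrank the model's recommendation.

```python
# Minimal decision-engine sketch: deterministic rules always win, and the
# model's recommendation only matters when no hard rule fired.
# HARD_DECLINE_FLAGS and finalize_decision are illustrative names.

HARD_DECLINE_FLAGS = {"credit_score_below_minimum", "prohibited_geography"}

def finalize_decision(flags: list[str], model_recommendation: str) -> str:
    """Combine rule flags with the LLM recommendation into approve/refer/decline."""
    if any(f in HARD_DECLINE_FLAGS for f in flags):
        return "decline"   # hard constraints are non-negotiable
    if flags:
        return "refer"     # soft flags always get a human look
    if model_recommendation in {"approve", "refer", "decline"}:
        return model_recommendation
    return "refer"         # unrecognized model output falls back to manual review
```

Note the last branch: if the model ever returns something outside the expected vocabulary, the safe default is a referral, never an approval.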
Implementation
1) Define the underwriting schema and deterministic checks
Keep the application contract explicit. In lending, loose JSON blobs become audit problems fast.
```python
from typing import Literal

from pydantic import BaseModel, Field


class LoanApplication(BaseModel):
    applicant_id: str
    state: str
    annual_income: float = Field(gt=0)
    monthly_debt: float = Field(ge=0)
    requested_amount: float = Field(gt=0)
    credit_score: int = Field(ge=300, le=850)
    employment_months: int = Field(ge=0)


class UnderwritingDecision(BaseModel):
    decision: Literal["approve", "refer", "decline"]
    reason: str
    risk_flags: list[str]


def hard_rules(app: LoanApplication) -> list[str]:
    flags = []
    dti = app.monthly_debt * 12 / app.annual_income
    if app.credit_score < 620:
        flags.append("credit_score_below_minimum")
    if dti > 0.45:
        flags.append("debt_to_income_above_threshold")
    if app.employment_months < 6:
        flags.append("insufficient_employment_history")
    return flags
```
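To make the DTI check concrete, here is the arithmetic worked through for one borderline applicant. The 0.45 threshold is the one from `hard_rules` above; the income and debt figures are made up for illustration.

```python
# Worked example of the DTI threshold used in hard_rules:
# an applicant with $1,800/month debt and $45,000 annual income.
monthly_debt = 1800.0
annual_income = 45000.0

dti = monthly_debt * 12 / annual_income   # 21600 / 45000 = 0.48
flags = []
if dti > 0.45:
    flags.append("debt_to_income_above_threshold")

print(f"DTI = {dti:.2f}, flags = {flags}")
# DTI = 0.48, flags = ['debt_to_income_above_threshold']
```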
2) Load underwriting policy documents with LangChain retrieval
Use TextLoader for plain-text policy files (or PyPDFLoader, or another approved loader, for PDFs), then index the chunks with FAISS. For production lending workflows, keep the corpus small and controlled.
```python
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader("underwriting_policy.txt", encoding="utf-8")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=120)
chunks = splitter.split_documents(docs)

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
```
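Since the audit trail later cites retrieved sources by their metadata, it pays to stamp each policy chunk with version information before indexing. A minimal sketch, using plain dicts so it stands independent of LangChain's `Document` type; the field names `policy_version` and `effective_date` are illustrative, and with LangChain you would set the same keys on each chunk's `.metadata` before calling `FAISS.from_documents`:

```python
# Sketch: stamp each policy chunk with version metadata before indexing,
# so retrieved sources can be cited precisely in the audit log.
# Field names (policy_version, effective_date) are illustrative.

def stamp_metadata(chunks: list[dict], version: str, effective_date: str) -> list[dict]:
    """Attach version info to each chunk's metadata dict in place."""
    for chunk in chunks:
        meta = chunk.setdefault("metadata", {})
        meta["policy_version"] = version
        meta["effective_date"] = effective_date
    return chunks
```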
3) Build the LangChain chain with structured output
The key pattern is: retrieve policy context, pass it to the LLM with a strict prompt, then force structured output using `with_structured_output()`. That keeps the agent usable by downstream systems instead of returning free-form prose.
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are an underwriting assistant for consumer lending. "
     "Use only the provided policy context and application data. "
     "Return a decision that is consistent with lending policy."),
    ("human",
     "Application:\n{application}\n\n"
     "Hard rule flags:\n{flags}\n\n"
     "Policy context:\n{context}\n\n"
     "Return approve/refer/decline with concise rationale."),
])

structured_llm = llm.with_structured_output(UnderwritingDecision)


def underwrite(app: LoanApplication) -> UnderwritingDecision:
    flags = hard_rules(app)
    docs = retriever.invoke(f"underwriting criteria for {app.state} consumer loan")
    context = "\n\n".join(d.page_content for d in docs)
    chain_input = {
        "application": app.model_dump(),
        "flags": flags,
        "context": context,
    }
    messages = prompt.format_messages(**chain_input)
    return structured_llm.invoke(messages)


app = LoanApplication(
    applicant_id="A123",
    state="TX",
    annual_income=90000,
    monthly_debt=1200,
    requested_amount=15000,
    credit_score=680,
    employment_months=24,
)
result = underwrite(app)
print(result.model_dump())
```
4) Add an audit trail before you ship
For lending, log what the agent saw and why it decided. If you cannot reconstruct the decision later, you do not have a production underwriting system.
```python
import json
from datetime import datetime, timezone


def audit_record(app: LoanApplication, decision: UnderwritingDecision,
                 flags: list[str], retrieved_docs) -> None:
    record = {
        # datetime.utcnow() is deprecated; use an explicit UTC timezone.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "applicant_id": app.applicant_id,
        "input": app.model_dump(),
        "hard_rule_flags": flags,
        "retrieved_sources": [d.metadata for d in retrieved_docs],
        "decision": decision.model_dump(),
    }
    with open("underwriting_audit.log", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```
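The test of an audit trail is whether you can reconstruct a decision from it. A minimal read-back sketch over the JSONL log written above; the function name `load_audit_trail` is illustrative, but the file path and field names match `audit_record`:

```python
import json

# Sketch: reconstruct all decisions for one applicant from the JSONL
# audit log written by audit_record. Oldest record first.
def load_audit_trail(applicant_id: str, path: str = "underwriting_audit.log") -> list[dict]:
    """Return every audit record for one applicant, in write order."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record["applicant_id"] == applicant_id:
                records.append(record)
    return records
```

In a review, this lets you show exactly which inputs, flags, and policy sources produced a given decision, which is what adverse-action support requires.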
Production Considerations
- **Keep deterministic policy outside the model**
  - Use code for hard declines and eligibility checks.
  - Let LangChain handle explanation and synthesis, not core compliance logic.
- **Track every retrieval source**
  - Store document IDs, version numbers, and timestamps.
  - Lending teams need to know which guideline version drove the decision.
- **Control data residency**
  - Do not send borrower PII to unmanaged endpoints.
  - Use region-bound infrastructure and approved model providers that match your regulatory footprint.
- **Add monitoring around drift and exceptions**
  - Watch approval rates by segment, false refer rates, latency, and retrieval failures.
  - If one segment starts getting unusual outcomes, stop auto-decisioning and route to manual review.
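The drift check in the last bullet can be sketched as a simple segment-level monitor. The thresholds and minimum-volume cutoff below are illustrative placeholders, not policy; in practice you would calibrate them against historical approval rates per segment.

```python
from collections import defaultdict

# Sketch: flag segments whose approval rate drifts outside an expected band,
# so they can be routed to manual review. Thresholds are illustrative.

def segments_to_pause(outcomes: list[tuple[str, str]],
                      low: float = 0.2, high: float = 0.8,
                      min_volume: int = 50) -> list[str]:
    """outcomes: (segment, decision) pairs; returns segments with unusual approval rates."""
    counts = defaultdict(lambda: [0, 0])   # segment -> [approvals, total]
    for segment, decision in outcomes:
        counts[segment][1] += 1
        if decision == "approve":
            counts[segment][0] += 1
    flagged = []
    for segment, (approved, total) in counts.items():
        if total < min_volume:
            continue                       # too little data to judge drift
        rate = approved / total
        if rate < low or rate > high:
            flagged.append(segment)
    return sorted(flagged)
```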
Common Pitfalls
- **Using the LLM as the final authority**
  - Bad pattern: "model says approve" without rule checks.
  - Fix it by enforcing hard constraints in Python before calling the model.
- **Letting prompts drift from policy**
  - Bad pattern: updating underwriting rules in a prompt nobody reviews.
  - Fix it by versioning policy docs and retrieving from an approved corpus only.
- **Returning unstructured text**
  - Bad pattern: free-form paragraphs that downstream systems must parse.
  - Fix it by using `with_structured_output()` with a Pydantic schema like `UnderwritingDecision`.
If you build this way, LangChain becomes the orchestration layer around underwriting logic instead of replacing underwriting discipline. That is the right shape for lending systems that need speed without losing control over compliance and auditability.
Keep learning
- •The complete AI Agents Roadmap — my full 8-step breakdown
- •Free: The AI Agent Starter Kit — PDF checklist + starter code
- •Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.