How to Build an Underwriting Agent for Lending Using LlamaIndex in Python

By Cyprian Aarons · Updated 2026-04-21
Tags: underwriting, llamaindex, python, lending

An underwriting agent for lending takes borrower data, pulls supporting documents, evaluates policy rules, and produces a structured credit recommendation with reasons. It matters because lending decisions need to be consistent, auditable, and fast enough for operations without turning every file into a manual review.

Architecture

A practical underwriting agent for lending needs these pieces:

  • Document ingestion layer

    • Loads bank statements, pay stubs, tax returns, IDs, and application forms.
    • Use SimpleDirectoryReader or a custom reader if documents come from S3, SharePoint, or an internal DMS.
  • Indexing layer

    • Stores and retrieves the borrower packet with VectorStoreIndex.
    • Keeps policy manuals, underwriting playbooks, and product rules in a separate index.
  • Policy retrieval layer

    • Uses as_query_engine() over the policy index to answer questions like:
      • “What is the minimum DSCR for this product?”
      • “What docs are required for self-employed applicants?”
  • Decision engine

    • Combines retrieved facts with hard rules.
    • Produces outputs like approve, refer, or decline plus reasons and missing evidence.
  • Audit and trace layer

    • Captures retrieved chunks, model output, and final decision.
    • Required for compliance reviews, adverse action support, and internal QA.
  • Guardrail layer

    • Prevents the model from inventing income, assets, or employment history.
    • Forces citations back to source documents before any recommendation is emitted.
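The decision engine described above can be sketched as a plain-Python rule layer that sits outside the model. The field names and thresholds below (a 0.43 DTI cap, a 620 credit-score floor) are illustrative assumptions, not real product policy:

```python
# Sketch of the decision engine: hard rules over verified facts.
# Field names and thresholds are illustrative assumptions, not real policy.

def decide(facts: dict) -> dict:
    """Return approve / refer / decline plus reasons and missing evidence."""
    required = ["monthly_income", "monthly_debt", "credit_score"]
    missing = [f for f in required if facts.get(f) is None]
    if missing:
        # Uncertainty never becomes an auto-decision: force a referral.
        return {"decision": "refer", "reasons": [], "missing_evidence": missing}

    reasons = []
    dti = facts["monthly_debt"] / facts["monthly_income"]
    if dti > 0.43:
        reasons.append(f"DTI {dti:.2f} exceeds 0.43 cap")
    if facts["credit_score"] < 620:
        reasons.append(f"Credit score {facts['credit_score']} below 620 floor")

    decision = "approve" if not reasons else "decline" if len(reasons) > 1 else "refer"
    return {"decision": decision, "reasons": reasons, "missing_evidence": []}
```

The point of this layer is that thresholds live in reviewable code (or a rules table), not in a prompt, so changing policy does not mean re-testing prompt behavior.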

Implementation

1) Load borrower documents and build an index

Start by loading the loan file into LlamaIndex. In production, your loader will likely point at object storage or a document system. The pattern below works with local files first so you can validate retrieval quality before wiring infrastructure.

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Borrower packet: bank statements, pay stubs, tax returns
docs = SimpleDirectoryReader(
    input_dir="./borrower_packet",
    recursive=True
).load_data()

borrower_index = VectorStoreIndex.from_documents(docs)
borrower_query_engine = borrower_index.as_query_engine(similarity_top_k=5)

That gives you semantic retrieval over the borrower file. For lending workflows, keep the borrower packet separate from policy content so you can explain what came from the file versus what came from underwriting rules.
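One lightweight way to validate retrieval quality before wiring infrastructure is a smoke test over facts you can verify by hand in a sample loan file. The helper below is a hypothetical sketch; it works with any object exposing a `query()` method, including the `borrower_query_engine` above:

```python
# Hypothetical retrieval smoke test: confirm that known facts from a
# sample loan file actually come back from the query engine.

def smoke_test_retrieval(query_engine, checks: dict) -> list:
    """Return the queries whose answers did NOT contain the expected text."""
    failures = []
    for query, expected in checks.items():
        answer = str(query_engine.query(query)).lower()
        if expected.lower() not in answer:
            failures.append(query)
    return failures
```

Run it with a handful of checks like `{"What is the applicant's employer?": "Acme Corp"}` against a packet you know; any failures usually point at chunking or loader problems, which are cheaper to fix now than after the decision layer is built.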

2) Build a policy index for underwriting rules

Your agent should not rely on prompt memory for policy. Put product rules, exception guidelines, and compliance notes into their own index so retrieval is deterministic and auditable.

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

policy_docs = SimpleDirectoryReader(
    input_dir="./underwriting_policy",
    recursive=True
).load_data()

policy_index = VectorStoreIndex.from_documents(policy_docs)
policy_query_engine = policy_index.as_query_engine(similarity_top_k=3)

rule_response = policy_query_engine.query(
    "What are the minimum income verification requirements for salaried applicants?"
)
print(rule_response)

This separation matters. If a regulator asks why a loan was referred or declined, you need to show the exact rule source used at decision time.

3) Create an underwriting function that combines evidence and rules

The agent should not “decide” from raw generation alone. Retrieve evidence first, then ask the LLM to produce a structured assessment constrained by that evidence.

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)

def underwrite_application(applicant_name: str) -> str:
    borrower_context = borrower_query_engine.query(
        f"Summarize verified income, employment status, debt obligations, "
        f"and any red flags for {applicant_name}."
    )

    policy_context = policy_query_engine.query(
        "Summarize the decision criteria for approval versus referral."
    )

    prompt = f"""
You are an underwriting assistant for lending.
Use only the provided evidence.

Borrower evidence:
{borrower_context}

Policy evidence:
{policy_context}

Return:
- Decision: approve | refer | decline
- Reasons: bullet list
- Missing evidence: bullet list
- Compliance note: one sentence
"""

    response = Settings.llm.complete(prompt)
    return str(response)

print(underwrite_application("Jane Doe"))

This is the core pattern. Retrieval constrains generation; generation formats the result into something ops teams can review quickly.
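Because the prompt asks for a fixed layout, downstream code should parse and validate the answer rather than trusting free text. The parser below is a sketch that assumes the exact `Decision:` / `Reasons:` / `Missing evidence:` labels from the prompt above; anything unparseable falls back to a referral rather than an approval:

```python
# Sketch of a validator for the LLM's structured answer. Assumes the
# labels from the underwriting prompt; unparseable output never auto-approves.

ALLOWED_DECISIONS = {"approve", "refer", "decline"}

def parse_assessment(text: str) -> dict:
    decision = None
    reasons, missing = [], []
    current = None
    for line in text.splitlines():
        line = line.strip().lstrip("-").strip()
        lower = line.lower()
        if lower.startswith("decision:"):
            decision = line.split(":", 1)[1].strip().lower()
            current = None
        elif lower.startswith("reasons:"):
            current = reasons
        elif lower.startswith("missing evidence:"):
            current = missing
        elif lower.startswith("compliance note:"):
            current = None
        elif line and current is not None:
            current.append(line)
    if decision not in ALLOWED_DECISIONS:
        decision = "refer"  # malformed output becomes a human review, not a decision
    return {"decision": decision, "reasons": reasons, "missing": missing}
```

This keeps the allowed decision vocabulary enforced in code, so a creative model output can never introduce a fourth outcome into your pipeline.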

4) Add citations and keep an audit trail

For lending, you want traceability down to retrieved chunks. At minimum store query text, retrieved sources, model output, timestamp, and final human override if one exists.

import json
from datetime import datetime, timezone

query_text = "What is the applicant's monthly gross income?"
result = borrower_query_engine.query(query_text)

audit_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "query": query_text,
    "answer": str(result),
    # Persist the retrieved chunks so the answer can be replayed later
    "sources": [node.node.get_content()[:200] for node in result.source_nodes],
}

with open("underwriting_audit.jsonl", "a") as f:
    f.write(json.dumps(audit_record) + "\n")

If you need stronger traceability later, move this into your event pipeline or observability stack. The important part is that every recommendation can be reconstructed from inputs and retrieved evidence.
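One cheap way to make those records tamper-evident before a full observability stack exists is to hash the fields that define a decision into the record itself. A sketch, assuming the `audit_record` dict layout above:

```python
# Sketch: a deterministic fingerprint over an audit record, so a stored
# decision can later be checked for tampering or replayed exactly.
import hashlib
import json

def fingerprint(record: dict) -> str:
    """SHA-256 over a canonical JSON form, excluding the fingerprint itself."""
    canonical = json.dumps(
        {k: record[k] for k in sorted(record) if k != "fingerprint"},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

record = {
    "timestamp": "2026-04-21T00:00:00+00:00",
    "query": "What is the applicant's monthly gross income?",
    "answer": "$8,200 per the two most recent pay stubs",
}
record["fingerprint"] = fingerprint(record)
```

Excluding the `fingerprint` key from its own hash keeps verification stable: recomputing the hash on a stored record should reproduce the stored value, and any edited field will not.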

Production Considerations

  • Keep data residency explicit

    • If borrower data must stay in-region, pin your vector store and LLM endpoint to that region.
    • Do not send PII to non-approved SaaS endpoints without legal signoff.
  • Separate decisioning from explanation

    • Use hard-coded business rules for thresholds like DTI caps or minimum documentation.
    • Let LlamaIndex handle retrieval and summarization; do not let it be the sole source of truth for approvals.
  • Log every retrieval path

    • Store top-k chunks returned by VectorStoreIndex.as_query_engine().
    • You need this for adverse action notices, QC sampling, and model governance reviews.
  • Add refusal behavior

    • If required fields are missing or conflicting across documents, force refer.
    • In lending, uncertainty should not become an auto-decision.
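The refusal behavior in the last bullet can be enforced outside the model entirely. A sketch of a pre-decision gate, assuming you have extracted the same numeric field (say, stated monthly income) from each document; the document keys and 5% tolerance are hypothetical:

```python
# Sketch of a pre-decision gate: if the same field disagrees across
# documents beyond a tolerance, force a referral instead of deciding.

def conflict_gate(values_by_doc: dict, tolerance: float = 0.05):
    """Return "refer" if document-stated values disagree by more than
    `tolerance` (relative), or None if they are consistent."""
    vals = list(values_by_doc.values())
    if not vals:
        return "refer"  # no evidence at all: never auto-decide
    lo, hi = min(vals), max(vals)
    if lo <= 0 or (hi - lo) / hi > tolerance:
        return "refer"
    return None
```

Running gates like this before the LLM ever sees the file means the model is only asked to explain decisions the rules allow, not to adjudicate conflicting evidence.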

Common Pitfalls

  1. Mixing policy docs with borrower docs in one index

    • This causes irrelevant retrieval and weak explanations.
    • Keep separate indexes: one for facts about the applicant, one for underwriting rules.
  2. Letting the model infer missing financial facts

    • A model may guess income trends or employment continuity from partial data.
    • Avoid this by prompting it to cite only retrieved evidence and return missing evidence when data is absent.
  3. Skipping auditability

    • A clean final answer is not enough in lending.
    • Persist queries, retrieved passages, timestamps, model version, and final outcome so compliance can replay the decision later.
  4. Using temperature > 0 in decision workflows

    • Non-deterministic outputs create inconsistent recommendations across identical files.
    • Keep underwriting prompts at temperature=0 unless you are generating non-decision summaries.

By Cyprian Aarons, AI Consultant at Topiax.