How to Build an Underwriting Agent Using LlamaIndex in Python for Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: underwriting, llamaindex, python, banking

An underwriting agent for banking takes a loan application, pulls the right policy and customer context, evaluates the request against credit and compliance rules, and returns a decision recommendation with evidence. It matters because underwriting is one of the few bank workflows where speed, consistency, and auditability all have to coexist.

Architecture

  • Document ingestion layer

    • Loads policy docs, product manuals, KYC rules, and underwriting playbooks.
    • In LlamaIndex, this is typically SimpleDirectoryReader plus chunking with SentenceSplitter.
  • Knowledge index

    • Stores bank-specific underwriting knowledge in a retrievable format.
    • Use VectorStoreIndex for semantic retrieval over policy text.
  • Retriever + query engine

    • Pulls only the relevant policy sections for a given application.
    • Use index.as_retriever() or index.as_query_engine().
  • Decision orchestration layer

    • Combines application data, retrieved policy evidence, and model output into a structured recommendation.
    • This is where you enforce deterministic checks before any LLM call.
  • Audit and trace logging

    • Captures retrieved sources, model response, timestamps, and decision rationale.
    • Required for internal audit, model risk management, and regulatory review.
  • Guardrails layer

    • Blocks unsupported decisions, PII leakage, and unsafe recommendations.
    • In banking, this should also enforce data residency and approved model endpoints.
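As a concrete sketch of the audit and trace layer, a minimal trace record could look like the following. The field names here are illustrative, not a regulatory standard; adapt them to what your audit team actually requires.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    """One decision trace: who was assessed, what evidence was used, what was decided."""
    applicant_id: str
    prompt_version: str
    retrieved_node_ids: list
    source_documents: list
    decision: str
    rationale: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = AuditRecord(
    applicant_id="CUST-10492",
    prompt_version="underwriting-prompt-v1",
    retrieved_node_ids=["node-12", "node-47"],
    source_documents=["personal_loan_policy.pdf"],
    decision="REVIEW_MANUAL",
    rationale="DTI near policy cap; routed to manual review",
)
print(asdict(record))
```

Persist one of these per decision, append-only, and the audit questions in the sections below become lookups instead of investigations.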

Implementation

  1. Load underwriting policies and build an index

Start by indexing only approved internal documents. Do not mix in raw customer files with policy knowledge unless your access controls are already isolated per tenant or case.

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Load internal underwriting docs
documents = SimpleDirectoryReader(
    input_dir="./underwriting_policies",
    required_exts=[".pdf", ".txt", ".md"]
).load_data()

# Chunk documents for retrieval
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)
nodes = splitter.get_nodes_from_documents(documents)

# Build the index
index = VectorStoreIndex(nodes)

# Create a retriever for policy lookup
retriever = index.as_retriever(similarity_top_k=3)
  2. Define the application payload and retrieve relevant policy context

Keep application data structured. Banks need deterministic pre-checks before any generative step: minimum income thresholds, DTI caps, bureau score floors, sanctions screening status, and residency constraints.

application = {
    "applicant_id": "CUST-10492",
    "loan_type": "personal_loan",
    "amount": 25000,
    "annual_income": 82000,
    "debt_to_income": 0.31,
    "credit_score": 712,
    "kyc_status": "verified",
    "sanctions_screened": True,
}

query = (
    f"Underwriting policy for {application['loan_type']} loans. "
    f"Need approval criteria for income {application['annual_income']}, "
    f"DTI {application['debt_to_income']}, credit score {application['credit_score']}."
)

retrieved_nodes = retriever.retrieve(query)
policy_context = "\n\n".join(
    [node.get_content() for node in retrieved_nodes]
)
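Before any retrieval or generation, it is worth rejecting malformed payloads outright. A minimal sketch of such a pre-check, with an illustrative required-field schema (not a standard):

```python
# Illustrative schema: field name -> accepted Python type(s)
REQUIRED_FIELDS = {
    "applicant_id": str,
    "loan_type": str,
    "amount": (int, float),
    "annual_income": (int, float),
    "debt_to_income": float,
    "credit_score": int,
    "kyc_status": str,
    "sanctions_screened": bool,
}

def validate_application(app: dict) -> list:
    """Return a list of validation errors; an empty list means the payload is usable."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in app:
            errors.append(f"missing field: {name}")
        elif not isinstance(app[name], expected):
            errors.append(f"bad type for {name}: {type(app[name]).__name__}")
    return errors
```

A non-empty error list should route the case straight to manual review rather than ever reaching the LLM.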
  3. Run the agent with an LLM-backed query engine

Use LlamaIndex’s query engine to produce a recommendation grounded in the retrieved context. For production banking systems, keep the prompt narrow: ask for a recommendation plus reasons tied to policy excerpts.

from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)

query_engine = index.as_query_engine(
    similarity_top_k=3,
    response_mode="compact"
)

prompt = f"""
You are an underwriting assistant for a bank.

Application:
{application}

Policy context:
{policy_context}

Task:
Return one of: APPROVE, REVIEW_MANUAL, DECLINE.
Explain the decision using only the provided policy context.
If information is missing or inconsistent, return REVIEW_MANUAL.
"""

response = query_engine.query(prompt)
print(str(response))
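Model output is free text, so the decision label should be extracted defensively rather than trusted as-is. A hedged sketch that defaults to manual review whenever no recognizable label is found:

```python
import re

ALLOWED_DECISIONS = ("APPROVE", "REVIEW_MANUAL", "DECLINE")

def parse_decision(text: str) -> str:
    """Extract the first recognized decision label; default to manual review."""
    match = re.search(r"\b(APPROVE|REVIEW_MANUAL|DECLINE)\b", text)
    return match.group(1) if match else "REVIEW_MANUAL"

print(parse_decision("Recommendation: APPROVE. Income and DTI within policy."))  # → APPROVE
```

Defaulting to REVIEW_MANUAL on an unparseable response is the safe failure mode: a human sees the case instead of a malformed answer silently becoming an approval.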
  4. Add deterministic controls before finalizing the recommendation

The LLM should not be the source of truth for hard rules. If credit score is below threshold or sanctions screening failed, override the model output with a fixed decision path.

def hard_rule_decision(app):
    if app["kyc_status"] != "verified":
        return "REVIEW_MANUAL", "KYC not verified"
    if not app["sanctions_screened"]:
        return "DECLINE", "Sanctions screening failed"
    if app["debt_to_income"] > 0.45:
        return "DECLINE", "DTI above threshold"
    if app["credit_score"] < 680:
        return "REVIEW_MANUAL", "Credit score below auto-approve threshold"
    return None

rule_decision = hard_rule_decision(application)

final_decision = {
    "applicant_id": application["applicant_id"],
    "decision": rule_decision[0] if rule_decision else str(response).split("\n")[0],
    "reason": rule_decision[1] if rule_decision else str(response),
}
print(final_decision)

Production Considerations

  • Deploy in a bank-approved environment

    • Keep models and vector stores inside approved cloud regions or on-prem infrastructure.
    • Data residency matters: underwriting files often contain regulated PII that cannot leave jurisdiction boundaries.
  • Log every retrieval path

    • Store retrieved node IDs, source document names, timestamps, prompt version, and final decision.
    • This is what audit teams will ask for when they challenge an approval or decline.
  • Separate deterministic rules from model reasoning

    • Hard compliance checks must run before the LLM.
    • The agent can explain decisions; it should not invent them.
  • Add guardrails around output format

    • Force structured responses like APPROVE, REVIEW_MANUAL, DECLINE.
    • Reject free-form answers that don’t include evidence from approved policy sources.
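One way to sketch the evidence guardrail is a check that the response actually references at least one retrieved source. The function below is illustrative and assumes you track the source file names returned by the retriever:

```python
def has_policy_evidence(response_text: str, source_names: list) -> bool:
    """Accept only responses that reference at least one retrieved policy source."""
    return any(name in response_text for name in source_names)

# A response with no traceable citation gets forced into manual review.
decision_text = "APPROVE based on general lending judgment."
if not has_policy_evidence(decision_text, ["personal_loan_policy.pdf"]):
    decision_text = "REVIEW_MANUAL: no policy evidence cited"
```

Substring matching is a deliberately crude baseline; a production version would compare against the structured citation metadata LlamaIndex attaches to response source nodes.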

Common Pitfalls

  • Using the LLM as the underwriter

    • Mistake: letting the model decide based on its own judgment.
    • Fix: use rules engine checks first, then let LlamaIndex handle retrieval and explanation.
  • Indexing sensitive operational data without access boundaries

    • Mistake: mixing customer PII with general policy content in one shared index.
    • Fix: isolate indexes by tenant, product line, or case; apply row-level or document-level access control.
  • Skipping evidence capture

    • Mistake: returning a decision without source citations or retrieved context.
    • Fix: persist retrieved nodes and response metadata so compliance can trace every recommendation back to policy text.
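The index-isolation fix above can be sketched as a document-level filter. This toy version assumes each indexed node carries a `tenant_id` metadata key, which is an assumption you must enforce at ingestion time, not a LlamaIndex default:

```python
def filter_nodes_by_tenant(nodes: list, tenant_id: str) -> list:
    """Keep only nodes tagged with the caller's tenant; untagged nodes are excluded."""
    return [
        node for node in nodes
        if node.get("metadata", {}).get("tenant_id") == tenant_id
    ]

nodes = [
    {"text": "Policy A", "metadata": {"tenant_id": "retail_bank"}},
    {"text": "Customer file", "metadata": {"tenant_id": "case_10492"}},
    {"text": "Untagged doc", "metadata": {}},
]
print(filter_nodes_by_tenant(nodes, "retail_bank"))  # only the Policy A node
```

Note the fail-closed behavior: a node with missing metadata is excluded for everyone, which is the correct default when PII may be involved.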

A banking underwriting agent is useful when it shortens review time without weakening control. Build it as a retrieval system with strict rules around it, not as a chatbot that happens to know finance.



By Cyprian Aarons, AI Consultant at Topiax.
