How to Build a Loan Approval Agent Using LlamaIndex in Python for Fintech
A loan approval agent automates the first pass of credit decisioning: it gathers applicant data, retrieves policy and underwriting rules, checks the application against those rules, and returns a structured recommendation with reasons. For fintech, this matters because you want faster decisions without turning your lending policy into a black box or violating compliance requirements.
Architecture
- Application intake layer
  - Accepts borrower data from your API, CRM, or loan origination system.
  - Normalizes fields like income, DTI, employment status, and requested amount.
- Policy knowledge base
  - Stores underwriting rules, product criteria, compliance notes, and exception policies.
  - Usually built from PDFs, markdown docs, internal wiki pages, and SOPs.
- Retrieval layer
  - Uses VectorStoreIndex and RetrieverQueryEngine to fetch the most relevant policy snippets.
  - Keeps the agent grounded in current lending policy instead of hallucinating.
- Decision engine
  - Applies deterministic checks for hard rules.
  - Uses an LLM only for explanation and edge-case reasoning, not for raw approval math.
- Audit and trace layer
  - Logs retrieved chunks, prompt inputs, outputs, and final decision.
  - Needed for model governance, dispute handling, and regulator review.
- Guardrails layer
  - Blocks unsupported decisions, PII leakage, and policy violations.
  - Enforces human review when confidence is low or exceptions are detected.
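Before diving into each step, here is one way the layers could be wired together. This is a minimal sketch, not the article's implementation: every name (`process_application`, `Decision`, the injected callables) is illustrative, and the real components are built in the sections that follow.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    decision: str                                      # "approve", "decline", or "manual_review"
    reason_codes: list = field(default_factory=list)
    policy_chunks: list = field(default_factory=list)  # retained for the audit layer

def process_application(app: dict, retrieve_policy, hard_rules,
                        explain, audit_log) -> Decision:
    """Intake -> retrieval -> deterministic checks -> explanation -> audit."""
    chunks = retrieve_policy(app)               # retrieval layer
    passed, reasons = hard_rules(app)           # decision engine (deterministic)
    decision = Decision(
        decision="approve" if passed else "manual_review",
        reason_codes=reasons,
        policy_chunks=chunks,
    )
    summary = explain(app, chunks, decision)    # LLM used only to explain
    audit_log(app, chunks, decision, summary)   # audit and trace layer
    return decision
```

Passing the components in as callables keeps the orchestration testable: each layer can be stubbed out independently.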
Implementation
1) Load lending policy documents into a LlamaIndex index
Start with your underwriting docs. Keep this separate from applicant data so you can version policies independently and prove which rule set was used for a decision.
```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Load internal lending policy docs
documents = SimpleDirectoryReader("./policy_docs").load_data()

# Split into retrieval-friendly chunks
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=80)
nodes = splitter.get_nodes_from_documents(documents)

# Build the index
index = VectorStoreIndex(nodes)

# Persist if you want repeatable deployments
index.storage_context.persist(persist_dir="./storage/policy_index")
```
This gives you a searchable policy index. In production, swap the default storage backend for your approved vector store if you need regional residency controls.
2) Create a retriever-backed query engine for underwriting context
You do not want the LLM guessing what your credit policy says. Use retrieval to pull the exact sections that matter for each application.
```python
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage/policy_index")
loaded_index = load_index_from_storage(storage_context)

query_engine = loaded_index.as_query_engine(similarity_top_k=3)

question = (
    "What are the eligibility rules for an unsecured personal loan "
    "for an applicant with monthly income of $6,000 and DTI of 38%?"
)
response = query_engine.query(question)
print(response)
```
At this point you have a grounded Q&A layer over your lending policy. For loan approval flows, use this to retrieve the applicable rules before making any decision.
3) Add deterministic checks before calling the model
Hard underwriting rules should be code, not prompts. If income thresholds or DTI limits are explicit in policy, calculate them directly and use the LLM only to explain the result.
```python
from dataclasses import dataclass

@dataclass
class LoanApplication:
    applicant_id: str
    monthly_income: float
    monthly_debt: float
    requested_amount: float
    employment_months: int

def calculate_dti(app: LoanApplication) -> float:
    return (app.monthly_debt / app.monthly_income) * 100.0

def hard_rule_check(app: LoanApplication) -> tuple[bool, list[str]]:
    reasons = []
    dti = calculate_dti(app)
    if app.monthly_income < 3000:
        reasons.append("Income below minimum threshold")
    if dti > 40:
        reasons.append(f"DTI too high at {dti:.1f}%")
    if app.employment_months < 12:
        reasons.append("Employment history below minimum requirement")
    return len(reasons) == 0, reasons

app = LoanApplication(
    applicant_id="A-10021",
    monthly_income=6000,
    monthly_debt=2280,
    requested_amount=12000,
    employment_months=18,
)

approved_by_rules, rule_reasons = hard_rule_check(app)
print(approved_by_rules, rule_reasons)
```
This pattern is important in fintech. It prevents the model from approving loans based on vague language when your actual policy requires strict thresholds.
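Building on that idea, the hard-rule result can be mapped to a workflow route in plain code as well. The following is an illustrative sketch, not part of the original flow, and the `auto_limit` threshold is a placeholder that would come from your actual lending policy:

```python
def route_application(approved_by_rules: bool, rule_reasons: list[str],
                      requested_amount: float, auto_limit: float = 25000) -> str:
    """Map deterministic check results to a workflow route.

    Thresholds here are placeholders; real values come from policy.
    """
    if not approved_by_rules:
        # Any hard-rule failure goes to manual review rather than silent
        # decline, so an underwriter can confirm the adverse action reasons.
        return "manual_review"
    if requested_amount > auto_limit:
        # Large amounts exceed the auto-decision mandate.
        return "manual_review"
    return "auto_approve"
```

Keeping routing deterministic means the LLM never sees an application that policy already disqualifies from auto-decisioning.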
4) Use an LLM only for structured explanation and exception handling
Once deterministic checks run, ask the model to summarize the decision using retrieved policy context. Keep output structured so downstream systems can store it cleanly.
```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

policy_context = response.response if hasattr(response, "response") else str(response)

messages = [
    ChatMessage(
        role="system",
        content=(
            "You are a loan decision assistant. "
            "Use only provided policy context and application facts. "
            "Return JSON with fields: decision, reason_codes, summary."
        ),
    ),
    ChatMessage(
        role="user",
        content=f"""
Application:
- applicant_id: {app.applicant_id}
- monthly_income: {app.monthly_income}
- monthly_debt: {app.monthly_debt}
- requested_amount: {app.requested_amount}
- employment_months: {app.employment_months}

Hard rule result:
- approved_by_rules: {approved_by_rules}
- rule_reasons: {rule_reasons}

Policy context:
{policy_context}
""",
    ),
]

chat_response = llm.chat(messages)
print(chat_response.message.content)
```
In production you would parse that JSON response and map it to an internal decision object. If `approved_by_rules` is false or confidence is low, route the application to manual review instead of auto-decisioning.
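The parsing step deserves real validation, since malformed model output should never flow into a decisioning system. Here is a minimal sketch of what that could look like; the field and decision whitelists are assumptions you would align with your own schema:

```python
import json

ALLOWED_FIELDS = {"decision", "reason_codes", "summary"}
ALLOWED_DECISIONS = {"approve", "decline", "manual_review"}

def parse_decision(raw: str) -> dict:
    """Validate the model's JSON output before it touches downstream systems.

    Raises ValueError on anything malformed so callers can route
    the application to manual review instead.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    extra = set(data) - ALLOWED_FIELDS
    if extra:
        # Reject unsupported fields outright, per the guardrails layer.
        raise ValueError(f"Unsupported fields in model output: {sorted(extra)}")
    if data.get("decision") not in ALLOWED_DECISIONS:
        raise ValueError(f"Unknown decision value: {data.get('decision')!r}")
    return data
```

Raising on failure, rather than defaulting, forces a conscious fallback path (manual review) instead of silently accepting bad output.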
Production Considerations
- Compliance logging
  - Persist application inputs, retrieved policy chunks, model version, prompt text, and final output.
  - This is what you need for auditability under lending governance and adverse action reviews.
- Data residency
  - Keep applicant PII inside approved regions and approved storage.
  - If your lender operates across jurisdictions, separate indexes by region so policy retrieval does not cross borders.
- Guardrails
  - Never let the LLM override hard underwriting rules.
  - Add schema validation on outputs and reject anything that is not valid JSON or contains unsupported fields like race or religion.
- Monitoring
  - Track approval rate drift by segment, retrieval hit quality, manual review rate, and exception frequency.
  - Sudden changes usually mean either policy drift or bad document ingestion.
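Approval-rate drift tracking can start very simply. The sketch below is illustrative only: the window size, baseline rate, and drift threshold are placeholder values you would calibrate per segment, and the class name is hypothetical.

```python
from collections import deque

class ApprovalRateMonitor:
    """Rolling approval-rate tracker that flags sudden drift."""

    def __init__(self, window: int = 500, baseline_rate: float = 0.6,
                 max_drift: float = 0.10):
        self.decisions = deque(maxlen=window)
        self.baseline_rate = baseline_rate
        self.max_drift = max_drift

    def record(self, approved: bool) -> None:
        self.decisions.append(1 if approved else 0)

    def current_rate(self) -> float:
        return sum(self.decisions) / len(self.decisions) if self.decisions else 0.0

    def drifted(self) -> bool:
        # Only alert once the window is full; a partial window is noisy.
        if len(self.decisions) < self.decisions.maxlen:
            return False
        return abs(self.current_rate() - self.baseline_rate) > self.max_drift
```

In practice you would run one monitor per segment, since aggregate rates can hide offsetting shifts between segments.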
Common Pitfalls
- Using the LLM as the source of truth
  - Mistake: asking the model to decide approval directly from applicant data.
  - Fix: encode hard rules in Python first; use LlamaIndex for retrieval plus explanation only.
- Mixing stale policies with current applications
  - Mistake: indexing old underwriting docs without version control.
  - Fix: tag every document with effective dates and rebuild indexes when policy changes.
- Ignoring explainability requirements
  - Mistake: returning “approved” or “denied” without reason codes.
  - Fix: always emit structured outputs with rule references so ops teams can support disputes and compliance reviews.
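The effective-date fix above reduces to a simple selection rule. This sketch uses an in-memory list of version records for illustration; in a real build the metadata would live on the indexed documents themselves, and the filenames here are made up:

```python
from datetime import date

# Illustrative policy-version records (filenames and dates are placeholders).
policies = [
    {"doc": "underwriting_v1.md", "effective_from": date(2023, 1, 1)},
    {"doc": "underwriting_v2.md", "effective_from": date(2024, 6, 1)},
]

def applicable_policy(policies: list[dict], as_of: date) -> dict:
    """Pick the newest policy version that was effective on the given date."""
    eligible = [p for p in policies if p["effective_from"] <= as_of]
    if not eligible:
        raise ValueError(f"No policy effective on {as_of}")
    return max(eligible, key=lambda p: p["effective_from"])
```

Recording which version was selected alongside each decision is what lets you later prove which rule set a given application was judged against.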
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.