How to Build a Policy Q&A Agent for Lending Using LlamaIndex in Python
A policy Q&A agent for lending answers questions like “Can we approve this borrower under our current DTI rules?” or “What documents are required for a self-employed applicant?” It matters because lending teams need fast, consistent answers grounded in policy, not memory or guesswork. If the agent cannot cite the exact policy source, you do not have a usable system.
Architecture
- Policy document store
  - Ingest underwriting guides, credit policy PDFs, SOPs, and exception matrices.
  - Keep versions so you can answer based on the policy in force at decision time.
- Document parsing and indexing layer
  - Use LlamaIndex loaders and chunking to turn PDFs and docs into retrievable nodes.
  - Store metadata like `policy_name`, `effective_date`, `jurisdiction`, and `version`.
- Retriever
  - Fetch the most relevant policy chunks for each user question.
  - In lending, retrieval needs to be strict enough to avoid mixing consumer mortgage rules with SME lending rules.
- LLM response layer
  - Generate an answer only from retrieved policy context.
  - Force citations so analysts can trace every answer back to source text.
- Guardrails and escalation
  - Detect low-confidence questions, missing policy coverage, or requests that look like credit decisions.
  - Escalate to compliance or underwriting instead of hallucinating.
- Audit logging
  - Persist the question, retrieved nodes, model output, timestamps, user identity, and policy version.
  - This is non-negotiable for compliance review and dispute handling.
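The audit-logging requirement above can be sketched as a small append-only record store. This is a minimal illustration, not a production logger; all field names and the JSON-lines file format are assumptions for the example.

```python
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class AuditRecord:
    # Everything needed to reconstruct an answer during a compliance review.
    user_id: str
    question: str
    answer: str
    policy_version: str
    retrieved_node_ids: list
    timestamp: float = field(default_factory=time.time)


def log_audit_record(record: AuditRecord, path: str = "audit.jsonl") -> None:
    # Append-only JSON lines: one record per answered question.
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")


record = AuditRecord(
    user_id="analyst-42",
    question="Max DTI for unsecured personal loans?",
    answer="45% per section 3.2.",
    policy_version="2024-06",
    retrieved_node_ids=["node-17", "node-88"],
)
log_audit_record(record)
```

An append-only log is deliberately simple: records are never mutated, which is exactly the property an auditor wants.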
Implementation
1) Install dependencies and load your policy documents
For lending use cases, start with a folder of approved policy PDFs. Tag every file with metadata before indexing so you can filter by product line or jurisdiction later.
```shell
pip install llama-index llama-index-readers-file pypdf
```

```python
from pathlib import Path

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

policy_dir = Path("./policies")
documents = SimpleDirectoryReader(
    input_dir=str(policy_dir),
    recursive=True,
).load_data()
print(f"Loaded {len(documents)} policy documents")
```
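To tag files before indexing, `SimpleDirectoryReader` accepts a `file_metadata` callable that maps a file path to a metadata dict. Here is one way to derive tags from the folder layout; the `./policies/<jurisdiction>/<product_type>/` layout is an assumption for this example.

```python
from pathlib import Path


def policy_metadata(file_path: str) -> dict:
    # Assumes a layout like ./policies/<jurisdiction>/<product_type>/<file>.pdf
    parts = Path(file_path).parts
    return {
        "jurisdiction": parts[-3] if len(parts) >= 3 else "unknown",
        "product_type": parts[-2] if len(parts) >= 2 else "unknown",
        "source_file": Path(file_path).name,
    }


# Pass the callable to SimpleDirectoryReader so every document carries these fields:
# documents = SimpleDirectoryReader(input_dir="./policies", recursive=True,
#                                   file_metadata=policy_metadata).load_data()

print(policy_metadata("policies/uk/mortgage/underwriting_guide.pdf"))
# → {'jurisdiction': 'uk', 'product_type': 'mortgage', 'source_file': 'underwriting_guide.pdf'}
```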
2) Build an index with metadata-aware chunks
Use `SentenceSplitter` through `Settings` so your chunks are small enough for precise retrieval. For lending policies, smaller chunks reduce the chance that the model blends different rules together.
```python
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=80)

index = VectorStoreIndex.from_documents(documents)
```
If your policies are versioned by product or geography, attach metadata during ingestion and filter at query time. That keeps a UK mortgage analyst from seeing US consumer loan guidance.
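LlamaIndex exposes this as metadata filters on the retriever (see `MetadataFilters` in `llama_index.core.vector_stores`). The underlying idea is just exact-match filtering over node metadata, sketched here framework-free with hypothetical node dicts:

```python
def matches(metadata: dict, required: dict) -> bool:
    # Exact-match filter: every required key must be present and equal.
    return all(metadata.get(key) == value for key, value in required.items())


# Hypothetical retrieved nodes carrying the metadata attached at ingestion.
nodes = [
    {"text": "Max LTV is 85% ...", "metadata": {"jurisdiction": "UK", "product_type": "mortgage"}},
    {"text": "Min FICO is 660 ...", "metadata": {"jurisdiction": "US", "product_type": "consumer_loan"}},
]

uk_only = [n for n in nodes if matches(n["metadata"], {"jurisdiction": "UK"})]
print(len(uk_only))  # → 1: only the UK mortgage node survives the filter
```

Applying the filter before ranking is what guarantees a UK analyst can never retrieve US guidance, no matter how semantically similar the text is.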
3) Create a citation-first query engine
The default `as_query_engine()` pattern is enough to get started. Set a low temperature and ask for concise answers with sources; in lending, you want traceability over creativity.
```python
query_engine = index.as_query_engine(
    similarity_top_k=4,
    response_mode="compact",
)

question = (
    "What is the minimum documentation required for a self-employed "
    "borrower applying for an unsecured personal loan?"
)
response = query_engine.query(question)

print("ANSWER:")
print(response.response)
print("\nSOURCES:")
for source in response.source_nodes:
    # Guard against None scores, which some retrievers return.
    print(f"- score={source.score or 0:.3f} | {source.node.metadata}")
    print(source.node.get_text()[:300], "\n")
```
This gives you two things you need in production: an answer and the evidence behind it. If the source nodes do not contain enough context, do not let the agent invent an answer.
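One way to turn that evidence into analyst-facing citations is a small formatter over node metadata. The node shape here is illustrative: plain dicts standing in for LlamaIndex source nodes, with the metadata fields tagged at ingestion.

```python
def format_citation(metadata: dict, score: float) -> str:
    # Build a human-readable provenance line from node metadata.
    name = metadata.get("policy_name", "unknown policy")
    version = metadata.get("version", "n/a")
    effective = metadata.get("effective_date", "n/a")
    return f"{name} (v{version}, effective {effective}, score {score:.2f})"


citation = format_citation(
    {"policy_name": "Unsecured Lending Policy", "version": "3.1", "effective_date": "2024-01-01"},
    0.812,
)
print(citation)
# → Unsecured Lending Policy (v3.1, effective 2024-01-01, score 0.81)
```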
4) Add a simple refusal path for out-of-policy questions
A lending agent should refuse to make credit decisions or provide legal advice. You can add a thin wrapper that checks whether retrieval confidence is too low before answering.
```python
def answer_policy_question(question: str):
    result = query_engine.query(question)
    top_scores = [node.score or 0 for node in result.source_nodes]
    best_score = max(top_scores) if top_scores else 0
    if best_score < 0.25:
        return {
            "answer": (
                "I could not find a matching policy section. "
                "Escalate to underwriting/compliance."
            ),
            "sources": [],
            "needs_human_review": True,
        }
    return {
        "answer": result.response,
        "sources": [
            {
                "score": node.score,
                "metadata": node.node.metadata,
                "text": node.node.get_text()[:400],
            }
            for node in result.source_nodes
        ],
        "needs_human_review": False,
    }


print(answer_policy_question("Can we waive income verification for this applicant?"))
```
That pattern is simple but effective. It prevents the assistant from answering when retrieval is weak or when the policy corpus does not cover the request.
Production Considerations
- Deployment
  - Keep the index in a controlled environment with encrypted storage.
  - For regulated lending data, separate dev/test/prod indexes and restrict who can ingest new policy versions.
- Monitoring
  - Log every query with user ID, timestamp, retrieved node IDs, scores, and final answer.
  - Track unanswered questions and low-confidence responses; those usually indicate missing policies or bad chunking.
- Guardrails
  - Block prompts asking the agent to approve/deny loans, override underwriting rules, or ignore compliance requirements.
  - Add explicit refusal templates for legal interpretations and adverse action decisions.
- Data residency
  - Store embeddings and raw policy content in-region if your lending operation has jurisdictional constraints.
  - Do not send sensitive borrower files into the same index as public-facing policy docs unless access controls are enforced end-to-end.
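The guardrail point above can start as a pre-retrieval screen for decision-seeking prompts. This is only a crude lexical sketch: the phrase list is illustrative, and a real deployment would pair it with a classifier and human review rather than rely on substring matching.

```python
BLOCKED_PATTERNS = [
    "approve this loan",
    "deny this application",
    "override the underwriting",
    "ignore compliance",
    "waive the policy",
]


def is_decision_request(question: str) -> bool:
    # Crude lexical screen: flag questions that ask for a credit decision
    # or a policy override rather than a policy explanation.
    q = question.lower()
    return any(pattern in q for pattern in BLOCKED_PATTERNS)


print(is_decision_request("Can you approve this loan for a 52% DTI borrower?"))  # → True
print(is_decision_request("What is the maximum DTI for personal loans?"))        # → False
```

Run the screen before retrieval so blocked prompts never reach the LLM, and route them straight to the escalation path from step 4.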
Common Pitfalls
- Mixing policies across products
  - A personal loan rule set should not be retrieved alongside mortgage underwriting guidance.
  - Fix it by using metadata filters such as `product_type`, `jurisdiction`, and `effective_date`.
- Letting the model answer without evidence
  - If you skip citations, users will trust fluent nonsense.
  - Fix it by always returning source nodes and rejecting answers when retrieval confidence is low.
- Ignoring document versioning
  - Lending policies change often, especially around documentation thresholds and exception handling.
  - Fix it by storing effective dates and retiring old indexes or filtering them out at query time.
- Treating compliance as an afterthought
  - A Q&A agent that can explain policy but cannot prove provenance is risky in audits.
  - Fix it by logging prompts, retrieved passages, model outputs, and the exact policy version used for each response.
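The versioning fix comes down to selecting the policy version in force at decision time. A minimal sketch, assuming each version carries an ISO-8601 `effective_date` in its metadata:

```python
from datetime import date


def version_in_force(versions, decision_date):
    # Latest version whose effective_date is on or before the decision date.
    # ISO-8601 dates sort correctly as strings, so max() by effective_date works.
    eligible = [
        v for v in versions
        if date.fromisoformat(v["effective_date"]) <= decision_date
    ]
    return max(eligible, key=lambda v: v["effective_date"], default=None)


versions = [
    {"version": "2.0", "effective_date": "2023-01-15"},
    {"version": "3.0", "effective_date": "2024-06-01"},
]
print(version_in_force(versions, date(2024, 3, 10)))  # → the 2.0 policy applies
```

Storing this choice in the audit log alongside the answer is what lets you defend a decision made under a since-retired policy.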
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.