How to Build a Customer Support Agent Using LlamaIndex in Python for Lending
A customer support agent for lending answers borrower questions about applications, repayments, interest rates, due dates, payoff amounts, and document requirements. It matters because most lending support tickets are repetitive but high-risk: one wrong answer can create compliance issues, customer harm, or regulatory exposure.
Architecture
- **Channel layer**
  - Web chat, email, or internal ops console.
  - Normalizes the user message into a single request format.
- **Retrieval layer**
  - Pulls policy docs, product terms, FAQs, repayment rules, and servicing procedures from a controlled knowledge base.
  - Uses `VectorStoreIndex` and `RetrieverQueryEngine` in LlamaIndex.
- **Response generation layer**
  - Produces the final answer using an LLM with grounded context.
  - Keeps responses short, factual, and tied to source documents.
- **Guardrails layer**
  - Blocks advice outside policy.
  - Detects PII leakage, unsupported financial advice, and requests that need human escalation.
- **Audit and logging layer**
  - Stores user query, retrieved sources, model response, and escalation reason.
  - Required for lending compliance reviews and dispute handling.
- **Human handoff layer**
  - Routes edge cases to a support agent when the question touches credit decisions, hardship claims, fraud, or legal complaints.
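These layers compose into a single request path. The sketch below is a minimal illustration, not a reference implementation: `needs_human`, `check_guardrails`, and `escalate_to_human` are hypothetical stubs, and `query_engine` and `log_interaction` are built in the implementation steps that follow.

```python
from dataclasses import dataclass

@dataclass
class SupportRequest:
    user_id: str
    channel: str  # "web_chat", "email", or "ops_console"
    message: str

# Placeholder policies; step 3 and the guardrail sketches later in this
# guide make these concrete.
def needs_human(question: str) -> bool:
    return False

def check_guardrails(answer: str) -> bool:
    return True

def escalate_to_human(req: SupportRequest) -> str:
    return "A support agent will follow up on your request."

def handle_request(req: SupportRequest) -> str:
    # Channel layer: normalize every channel into one request shape.
    question = req.message.strip()

    # Human handoff layer, checked first: regulated topics skip the model.
    if needs_human(question):
        return escalate_to_human(req)

    # Retrieval + response generation layers (query_engine is built in
    # steps 1 and 2 of the implementation below).
    response = query_engine.query(question)

    # Guardrails layer: block out-of-policy language before it ships.
    if not check_guardrails(str(response)):
        return escalate_to_human(req)

    # Audit and logging layer (log_interaction is defined in step 4).
    log_interaction(req.user_id, question, response)
    return str(response)
```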
Implementation
1) Load lending support documents into a LlamaIndex index
Use a clean document set: product terms, repayment FAQ, collections policy, hardship policy, and servicing SOPs. Keep these documents versioned so every answer can be traced back to the exact policy revision.
```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Example directory structure:
# ./data/lending_faq/
# ./data/lending_policy/
# ./data/servicing_docs/
documents = SimpleDirectoryReader(
    input_dir="./data",
    recursive=True,
).load_data()

index = VectorStoreIndex.from_documents(documents)
```
This builds the retrieval backbone. In lending support, retrieval quality matters more than clever prompting because you want answers grounded in approved policy text.
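The traceability point above can be enforced at ingestion time. Here is a sketch using SimpleDirectoryReader's `file_metadata` hook; the `POLICY_VERSIONS` registry is a hypothetical stand-in for however you track approved revisions.

```python
import os
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Hypothetical registry mapping file names to approved policy revisions.
POLICY_VERSIONS = {
    "repayment_faq.md": "2024-06-v3",
    "hardship_policy.md": "2024-01-v2",
}

def lending_metadata(file_path: str) -> dict:
    file_name = os.path.basename(file_path)
    return {
        "file_name": file_name,
        "policy_version": POLICY_VERSIONS.get(file_name, "unversioned"),
    }

documents = SimpleDirectoryReader(
    input_dir="./data",
    recursive=True,
    file_metadata=lending_metadata,  # attached to every document at ingestion
).load_data()

index = VectorStoreIndex.from_documents(documents)
```

With this in place, every retrieved node carries the policy revision it came from, so citations in later steps can point to the exact version that was live.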
2) Build a query engine with source citations
Use `as_query_engine()` so the agent can answer from indexed material. Set `similarity_top_k` low enough to reduce noise but high enough to capture policy exceptions.
```python
from llama_index.core import Settings
# OpenAI ships in the llama-index-llms-openai integration package.
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)

query_engine = index.as_query_engine(
    similarity_top_k=4,
    response_mode="compact",
)

question = "What happens if a borrower misses a payment?"
response = query_engine.query(question)

print(response.response)
for source in response.source_nodes:
    print("SOURCE:", source.node.metadata.get("file_name"), source.score)
```
For lending support, always return citations. If a borrower disputes an answer later, you need the exact source document behind the response.
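You can enforce that rule in code rather than by convention. A minimal sketch, assuming a hypothetical `MIN_SCORE` threshold you would tune against your own corpus:

```python
MIN_SCORE = 0.7  # hypothetical cutoff; tune on your own documents

def answer_with_citations(question: str) -> dict:
    response = query_engine.query(question)
    sources = [
        {"file_name": n.node.metadata.get("file_name"), "score": n.score}
        for n in response.source_nodes
        if n.score is None or n.score >= MIN_SCORE  # some retrievers omit scores
    ]
    # No supporting documents above threshold: refuse instead of guessing.
    if not sources:
        return {"answer": None, "sources": [], "escalate": True}
    return {"answer": str(response), "sources": sources, "escalate": False}
```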
3) Add a router for safe escalation
Not every question should be answered by the model. Credit decisions, regulatory complaints, fraud claims, and legal threats should go to a human queue immediately.
```python
from llama_index.core.tools import QueryEngineTool
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

support_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="lending_support_knowledge_base",
    description=(
        "Answers questions about loan servicing, repayments, due dates, "
        "fees, statements, and standard support policies."
    ),
)

llm = OpenAI(model="gpt-4o-mini", temperature=0)

agent = ReActAgent.from_tools(
    tools=[support_tool],
    llm=llm,
    verbose=True,
    # `context` is appended to the built-in ReAct system prompt.
    context=(
        "You are a lending customer support agent. "
        "Only answer using retrieved policy content. "
        "If asked about credit approval decisions, legal advice, fraud investigations, "
        "or account-specific underwriting outcomes, escalate to a human."
    ),
)
```
print(agent.chat("Can I change my payment date?"))
This pattern keeps the agent useful without letting it improvise on regulated topics. The system prompt is not enough by itself; the tool boundary is what keeps scope tight.
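A belt-and-braces addition is a deterministic pre-filter in front of the agent, so regulated intents never reach the model at all. The patterns below are illustrative only; a production rule set would be maintained with compliance and reviewed regularly.

```python
import re

# Illustrative escalation triggers, not a complete rule set.
ESCALATION_PATTERNS = [
    r"\b(declin|denied|reject)\w*\b.*\b(loan|application|credit)\b",
    r"\bfraud\b",
    r"\b(lawyer|attorney|legal action|sue)\b",
    r"\bhardship\b",
]

def route_message(message: str) -> str:
    if any(re.search(p, message, re.IGNORECASE) for p in ESCALATION_PATTERNS):
        return "human_queue"
    return "agent"

user_message = "Why was my loan application declined?"
if route_message(user_message) == "agent":
    print(agent.chat(user_message))
else:
    print("Routing to a human support agent.")
```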
4) Store audit logs for compliance review
Log every interaction with timestamps and source metadata. In lending environments this is not optional; you need traceability for complaints handling and internal audits.
```python
import json
from datetime import datetime

def log_interaction(user_id: str, question: str, response_obj):
    record = {
        "timestamp": datetime.utcnow().isoformat(),
        "user_id": user_id,
        "question": question,
        "answer": str(response_obj),
        "sources": [
            {
                "file_name": n.node.metadata.get("file_name"),
                "score": n.score,
            }
            for n in response_obj.source_nodes
        ],
    }
    with open("audit_log.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")

result = query_engine.query("How do I get my payoff amount?")
log_interaction("cust_10291", "How do I get my payoff amount?", result)
```
If your deployment spans regions or countries, store logs in the correct jurisdiction. Data residency requirements can apply to both customer data and derived artifacts like transcripts.
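A lightweight way to make residency explicit in code is to key the audit sink by the customer's region and fail closed when the region is unknown. The mapping below is a placeholder; real deployments would write to region-pinned storage rather than local files.

```python
# Hypothetical mapping from customer region to an approved, region-pinned sink.
APPROVED_LOG_SINKS = {
    "eu": "/var/audit/eu/audit_log.jsonl",
    "us": "/var/audit/us/audit_log.jsonl",
}

def audit_sink_for(customer_region: str) -> str:
    # Fail closed: if the region has no approved sink, refuse to log
    # (and to answer) rather than defaulting to another jurisdiction.
    if customer_region not in APPROVED_LOG_SINKS:
        raise ValueError(f"No approved audit sink for region: {customer_region}")
    return APPROVED_LOG_SINKS[customer_region]
```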
Production Considerations
- **Deploy with strict document governance**
  - Only index approved policy content.
  - Version documents and keep an immutable record of what was live at each point in time.
- **Monitor for unsupported intent**
  - Track queries about credit score impact, loan approval reasons, refinancing eligibility changes, disputes over fees, and hardship claims.
  - Escalate when confidence is low or retrieval returns weak matches (a sketch of this check follows the list).
- **Add guardrails for regulated language**
  - Block promises like “you will be approved” or “this fee will definitely be waived.”
  - Force the assistant to say what it knows from policy and what requires human review.
- **Respect data residency and retention rules**
  - Keep embeddings, logs, and transcripts in approved regions.
  - Set retention policies aligned with banking/consumer finance regulations and internal legal holds.
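As referenced above, both the weak-retrieval check and the regulated-language check can be simple and deterministic. A minimal sketch, assuming a hypothetical `min_score` threshold and an illustrative banned-phrase list owned by compliance:

```python
import re

# Illustrative banned-promise patterns; the real list belongs to compliance.
BANNED_PROMISES = [
    r"you will be approved",
    r"definitely be waived",
    r"\bguaranteed\b",
]

def violates_regulated_language(answer: str) -> bool:
    return any(re.search(p, answer, re.IGNORECASE) for p in BANNED_PROMISES)

def retrieval_is_weak(response, min_score: float = 0.7) -> bool:
    # Hypothetical threshold; empty or low-scoring retrieval should escalate.
    scores = [n.score for n in response.source_nodes if n.score is not None]
    return not scores or max(scores) < min_score

response = query_engine.query("Will my late fee be waived?")
if retrieval_is_weak(response) or violates_regulated_language(str(response)):
    print("Escalating to human review.")
else:
    print(response.response)
```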
Common Pitfalls
- **Indexing raw CRM notes without cleanup**
  - CRM notes often contain PII, internal comments, or stale instructions.
  - Fix it by indexing only approved knowledge sources with redaction before ingestion.
- **Letting the model answer account-specific questions from memory**
  - Questions like “What is my current balance?” or “Was my payment posted?” need live systems.
  - Fix it by separating FAQ retrieval from authenticated account lookup tools and requiring explicit authorization checks (see the sketch after this list).
- **Skipping citation enforcement**
  - If users cannot see where an answer came from, support becomes hard to audit.
  - Fix it by always returning `source_nodes` metadata and rejecting answers without supporting documents.
- **Using one global prompt for all cases**
  - Lending has different rules for servicing questions versus complaints versus hardship requests.
  - Fix it with routing logic that sends each intent to the right tool chain or human queue.
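For the account-specific pitfall above, one way to keep the boundary explicit is a separate, authenticated tool alongside the knowledge-base tool. Everything here except `FunctionTool` is hypothetical: the auth check and servicing-system call are stubs you would replace with real integrations.

```python
from llama_index.core.tools import FunctionTool

def is_session_authorized(session_token: str, user_id: str) -> bool:
    # Stub: integrate with your real identity provider / session store.
    return False

def fetch_balance_from_servicing_system(user_id: str) -> str:
    # Stub: call the live servicing system of record, never the index.
    return "0.00"

def get_account_balance(user_id: str, session_token: str) -> str:
    """Return the live balance for an authenticated borrower."""
    if not is_session_authorized(session_token, user_id):
        return "ESCALATE: caller is not authenticated for this account."
    return f"Current balance: {fetch_balance_from_servicing_system(user_id)}"

account_tool = FunctionTool.from_defaults(
    fn=get_account_balance,
    name="account_balance_lookup",
    description=(
        "Looks up a borrower's live account balance. Requires an "
        "authenticated session; never answers from indexed documents."
    ),
)

# Two clearly separated tools: policy retrieval vs. authenticated lookup.
agent = ReActAgent.from_tools(
    tools=[support_tool, account_tool],
    llm=llm,
    verbose=True,
)
```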
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.