How to Build a Policy Q&A Agent for Banking Using LlamaIndex in Python
A policy Q&A agent for banking answers staff or customer questions from approved policy documents, procedures, and internal controls. It matters because the cost of a wrong answer is not just a bad user experience; it can become a compliance issue, an audit finding, or a customer harm event.
Architecture
- Document ingestion layer
  - Pulls policy PDFs, Word docs, and internal wiki pages into a controlled corpus.
  - Filters by business unit, jurisdiction, and document version.
- Chunking and indexing layer
  - Splits policy content into retrievable nodes.
  - Builds a vector index for semantic search and, ideally, a keyword index for exact clause lookup.
- Retriever
  - Finds the most relevant policy sections for a question.
  - Uses top-k retrieval with metadata filters like region, product line, and effective date.
- Response synthesizer
  - Turns retrieved evidence into a concise answer.
  - Forces citations so users can trace every answer back to source text.
- Guardrails layer
  - Blocks unsupported questions, low-confidence answers, and out-of-scope requests.
  - Routes sensitive cases to human review.
- Audit and observability
  - Logs question, retrieved sources, answer text, model version, and timestamp.
  - Supports compliance review and incident investigation.
Implementation
1) Load policy documents with metadata
For banking, metadata is not optional. You need to tag each document with jurisdiction, line of business, and version so retrieval does not mix UK retail policy with US commercial policy.
```python
from llama_index.core import SimpleDirectoryReader

# Load every supported file under ./policy_docs, including subfolders
docs = SimpleDirectoryReader(
    input_dir="./policy_docs",
    recursive=True,
).load_data()

# Tag each document so retrieval can filter by unit and jurisdiction
for doc in docs:
    doc.metadata.update({
        "source_system": "policy_repo",
        "business_unit": "retail_banking",
        "jurisdiction": "uk",
        "doc_type": "policy",
    })

print(f"Loaded {len(docs)} documents")
```
2) Build the index
This pattern uses VectorStoreIndex for semantic retrieval. In production you would back it with a real vector store that meets your data residency requirements instead of keeping everything local.
```python
from llama_index.core import VectorStoreIndex

# Embed and index the tagged documents, then expose a query engine
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(
    similarity_top_k=5,
    response_mode="compact",
)
```
If you need stricter control over exact wording from policies, add SummaryIndex or KeywordTableIndex alongside the vector index. For most banking Q&A use cases, vector retrieval plus citations is the starting point.
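The value of pairing keyword lookup with vector retrieval can be sketched independently of LlamaIndex. The snippet below is a hypothetical illustration, not a LlamaIndex API: it blends a precomputed semantic score with an exact-term overlap score so clause references like "4.2.1" are not lost to purely semantic matching. The function name, weights, and chunk format are all assumptions for illustration.

```python
def hybrid_rank(question_terms, chunks, alpha=0.7):
    """Blend a semantic score with a keyword-overlap score per chunk.

    chunks: dicts with "text" and a precomputed "semantic_score" in [0, 1].
    alpha weights semantic similarity vs. exact keyword hits (illustrative).
    """
    ranked = []
    for chunk in chunks:
        words = chunk["text"].lower().split()
        overlap = sum(1 for term in question_terms if term.lower() in words)
        keyword_score = overlap / max(len(question_terms), 1)
        score = alpha * chunk["semantic_score"] + (1 - alpha) * keyword_score
        ranked.append((score, chunk["text"]))
    # Highest combined score first
    return [text for score, text in sorted(ranked, reverse=True)]

chunks = [
    {"text": "Clause 4.2.1 paper statements waiver", "semantic_score": 0.4},
    {"text": "General account servicing overview", "semantic_score": 0.5},
]
# The clause chunk wins despite a lower semantic score, because of the
# exact match on "4.2.1"
print(hybrid_rank(["clause", "4.2.1"], chunks))
```

In a real deployment the semantic score would come from your vector store and the keyword pass from a proper inverted index, but the ranking logic is the same.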
3) Ask questions with grounded answers
Use query_engine.query() for direct Q&A. Keep the prompt narrow: answer only from retrieved policy content and say when the policy does not cover the question.
```python
question = "Can a retail customer waive paper statements online?"
response = query_engine.query(
    f"""
    Answer only using the provided policy context.
    If the policy does not explicitly allow this action, say so.
    Include source references in the answer.
    Question: {question}
    """
)
print(str(response))
```
4) Add a stricter response path for sensitive banking use cases
For customer-facing or high-risk workflows, wrap retrieval in an explicit guardrail check. If confidence is low or no relevant context is found, route to manual review instead of generating a speculative answer.
```python
def answer_policy_question(question: str) -> dict:
    response = query_engine.query(question)
    # Basic gating: require an evidence-backed response
    source_nodes = getattr(response, "source_nodes", [])
    if not source_nodes:
        return {
            "answer": "I could not find an approved policy source for this question.",
            "status": "needs_review",
        }
    return {
        "answer": str(response),
        "status": "answered",
        "sources": [
            {
                "score": node.score,
                "text": node.node.get_text()[:300],
                "metadata": node.node.metadata,
            }
            for node in source_nodes
        ],
    }

result = answer_policy_question("What is the retention period for KYC records?")
print(result["status"])
print(result["answer"])
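The gate above only checks that sources exist; it says nothing about how relevant they are. A natural extension is to also require a minimum retrieval score before answering. The sketch below uses plain dicts as stand-ins for retrieved nodes (in LlamaIndex the score lives on `node.score`), and the threshold value is a hypothetical starting point you would tune against a labeled test set.

```python
MIN_SCORE = 0.75  # hypothetical threshold; tune on real policy questions

def gate_by_score(source_nodes, min_score=MIN_SCORE):
    """Return (passed, best_score); nodes with score=None count as 0."""
    if not source_nodes:
        return False, 0.0
    best = max((node.get("score") or 0.0) for node in source_nodes)
    return best >= min_score, best

# Dict stand-ins for retrieved nodes:
print(gate_by_score([{"score": 0.81}, {"score": 0.55}]))  # (True, 0.81)
print(gate_by_score([{"score": 0.42}]))                   # (False, 0.42)
```

Anything that fails the gate goes down the same `needs_review` path as an empty retrieval, rather than producing a speculative answer.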
Production Considerations
- Data residency
  - Keep embeddings, indexes, and logs in-region where required by regulation.
  - Do not send confidential policy text to external services unless legal and security approvals are in place.
- Auditability
  - Log every query with retrieved document IDs, chunk IDs, model name, and response timestamp.
  - Store immutable traces so compliance teams can reconstruct why an answer was given.
- Guardrails
  - Block questions that ask for legal advice beyond published policy.
  - Require human approval for ambiguous requests like exceptions, overrides, or customer remediation decisions.
- Monitoring
  - Track retrieval hit rate, unanswered queries, citation coverage, and escalation volume.
  - Alert on spikes in “no source found” responses; that usually means stale content or broken ingestion.
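One way to make audit traces tamper-evident is to hash-chain each log entry to the previous one, so any after-the-fact edit breaks the chain. This is a minimal sketch under assumptions: field names, the `build_audit_record` helper, and the model label are all illustrative, and a production system would append these records to write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone

def build_audit_record(question, answer, source_ids, model, prev_hash=""):
    """Build one append-only audit entry, chained to the previous entry."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "source_ids": source_ids,   # document and chunk IDs used as evidence
        "model": model,
        "prev_hash": prev_hash,     # links this entry to the one before it
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    return record

entry = build_audit_record(
    "What is the KYC retention period?",
    "Five years after relationship exit, per the retention policy.",
    ["doc-12#chunk-3"],
    "model-x",  # hypothetical model identifier
)
print(entry["hash"][:12])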
Common Pitfalls
- Mixing document versions
  - A bank often has overlapping policies across regions and effective dates.
  - Avoid this by filtering on metadata like `jurisdiction`, `effective_date`, and `version` during retrieval.
- Letting the model answer without evidence
  - If you do plain generation after retrieval failure, you will get confident nonsense.
  - Avoid this by requiring source nodes before returning an answer and escalating otherwise.
- Ignoring access control
  - Not every employee should see every policy document.
  - Enforce document-level permissions before indexing or at retrieval time so the agent never exposes restricted content.
- Skipping evaluation on real policy questions
  - Synthetic prompts miss edge cases like exceptions language or contradictory clauses.
  - Build a test set from actual banking FAQs and measure citation accuracy before rollout.
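The version-mixing pitfall comes down to one question: which single document was in force for this jurisdiction on this date? The helper below is an illustrative sketch, independent of any vector store; the metadata keys are assumptions that you would map to your own tagging scheme.

```python
from datetime import date

def effective_policy(docs, jurisdiction, on=None):
    """Pick the policy version in force for a jurisdiction on a given date.

    docs: dicts with "jurisdiction", "version", and "effective_date" (a date).
    Metadata keys are hypothetical; match them to your tagging scheme.
    """
    on = on or date.today()
    candidates = [
        d for d in docs
        if d["jurisdiction"] == jurisdiction and d["effective_date"] <= on
    ]
    if not candidates:
        return None  # no policy in force: escalate instead of guessing
    # Latest effective date wins among versions already in force
    return max(candidates, key=lambda d: d["effective_date"])

docs = [
    {"jurisdiction": "uk", "version": "v2", "effective_date": date(2024, 1, 1)},
    {"jurisdiction": "uk", "version": "v3", "effective_date": date(2025, 6, 1)},
    {"jurisdiction": "us", "version": "v1", "effective_date": date(2023, 3, 1)},
]
print(effective_policy(docs, "uk", on=date(2024, 12, 31))["version"])  # v2
```

Applying this before (or as a metadata filter during) retrieval is what keeps a superseded UK policy from being cited alongside the current US one.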
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.