How to Build a Policy Q&A Agent Using LlamaIndex in Python for Retail Banking

By Cyprian Aarons · Updated 2026-04-21

A policy Q&A agent for retail banking answers employee or customer-facing questions from approved policy documents, procedure manuals, and product disclosures. It matters because policy interpretation is one of the biggest sources of inconsistent answers, compliance drift, and avoidable escalations across branches, contact centers, and digital channels.

Architecture

  • Document ingestion layer

    • Pulls policy PDFs, Word docs, and internal knowledge articles from a controlled source.
    • Normalizes content before indexing so retrieval stays stable across updates.
  • Vector index

    • Stores embeddings for semantic retrieval over policy text.
    • Use VectorStoreIndex from LlamaIndex for the main retrieval path.
  • Retriever

    • Fetches the top-k policy chunks relevant to a user question.
    • Use index.as_retriever(similarity_top_k=...) for controlled recall.
  • Response synthesizer / query engine

    • Generates an answer grounded in retrieved policy snippets.
    • Use index.as_query_engine(...) with a strict prompt style that cites sources.
  • Guardrails layer

    • Blocks disallowed requests like legal advice, account-specific decisions, or unsupported product claims.
    • Enforces “answer only from retrieved context” behavior.
  • Audit logging

    • Records question, retrieved document IDs, answer text, and timestamp.
    • Required for traceability in regulated banking environments.

Implementation

  1. Install LlamaIndex and load your policy documents

Use a local or private deployment path. For retail banking, do not point this at random public storage; keep policies in an approved repository with access control and retention rules.

pip install llama-index llama-index-core llama-index-embeddings-openai

from llama_index.core import SimpleDirectoryReader

# Load approved internal policy docs only
documents = SimpleDirectoryReader(
    input_dir="./bank_policies",
    recursive=True
).load_data()

print(f"Loaded {len(documents)} documents")
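The ingestion layer above calls for normalizing content before indexing. A minimal sketch of what that can look like; the cleaning rules here are illustrative assumptions for PDF/Word exports, not a LlamaIndex API:

```python
import re

def normalize_policy_text(text: str) -> str:
    """Normalize raw policy text before indexing so chunk boundaries and
    retrieval stay stable across document re-exports (illustrative rules)."""
    text = text.replace("\u00a0", " ")      # non-breaking spaces from Word/PDF exports
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)  # cap consecutive blank lines
    return text.strip()

print(normalize_policy_text("Fee\u00a0waiver   policy\n\n\n\nSection 2"))
```

You could apply it over the loaded documents, e.g. `doc.set_content(normalize_policy_text(doc.get_content()))`, before building the index.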
  2. Build a vector index over the policy corpus

This is the core retrieval layer. In production you would usually pair this with a private embedding model and a compliant vector store; the pattern below is the same.

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=4)  # direct retrieval path, useful for inspection and eval
  3. Create a grounded query engine with citations

For banking use cases, answers must stay tied to source text. Keep temperature low and force citation-style responses so reviewers can trace every statement back to policy.

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

# If you are using OpenAI in dev, configure your API key externally.
# In production, swap this for your approved internal model endpoint.
Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0.0)

query_engine = index.as_query_engine(
    similarity_top_k=4,
    response_mode="compact",
)

question = "Can we waive overdraft fees for a customer who called twice this month?"
response = query_engine.query(question)

print(str(response))
print("\nSources:")
for source in response.source_nodes:
    print("-", source.node.metadata.get("file_name", "unknown"), "| score:", source.score)
  4. Add a policy gate before answering

Retail banking needs hard boundaries. If the question asks for account-specific advice, regulatory interpretation beyond the document set, or anything outside scope, return a safe refusal and route to compliance or operations.

DISALLOWED_PATTERNS = [
    "legal advice",
    "guarantee approval",
    "override KYC",
    "bypass AML",
]

def is_allowed_question(question: str) -> bool:
    q = question.lower()
    return not any(p in q for p in DISALLOWED_PATTERNS)

def answer_policy_question(question: str):
    if not is_allowed_question(question):
        return {
            "answer": "I can only answer from approved bank policies. Please route this to Compliance or Operations.",
            "sources": []
        }

    result = query_engine.query(question)
    return {
        "answer": str(result),
        "sources": [
            {
                "file": node.node.metadata.get("file_name", "unknown"),
                "score": node.score,
            }
            for node in result.source_nodes
        ]
    }

print(answer_policy_question("What is the fee waiver rule for overdrafts?"))
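The gate above blocks disallowed topics; you may also want to refuse when retrieval confidence is low, since a weak match usually means the corpus doesn't cover the question. A sketch on top of the `sources` list returned by `answer_policy_question`; the threshold and `should_escalate` helper are assumptions to tune against your own evaluation set, not LlamaIndex defaults:

```python
MIN_SCORE = 0.70  # tune against a labeled evaluation set

def should_escalate(sources: list, min_score: float = MIN_SCORE) -> bool:
    """Escalate to a human when no retrieved chunk clears the threshold."""
    scores = [s["score"] for s in sources if s.get("score") is not None]
    return not scores or max(scores) < min_score

print(should_escalate([{"file": "fees.pdf", "score": 0.41}]))  # weak match: route to a human
print(should_escalate([{"file": "fees.pdf", "score": 0.88}]))  # strong match: answer normally
```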

Production Considerations

  • Data residency

    • Keep documents, embeddings, logs, and model inference inside approved regions.
    • For banks operating across jurisdictions, separate indexes by region if policies differ by country.
  • Auditability

    • Log the user question, top retrieved chunks, document IDs, answer version, and model version.
    • This is what lets compliance teams reconstruct why a given answer was produced.
  • Access control

    • Restrict ingestion to approved repositories and enforce role-based access on both retrieval and logs.
    • A retail banking policy agent should never expose internal procedures to unauthorized staff or customers.
  • Monitoring and drift detection

    • Track unanswered questions, low-retrieval-score queries, and escalation rates.
    • If policy updates land weekly but your index refreshes monthly, your agent will start giving stale answers fast.
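The auditability bullet above can be made concrete with one structured record per answered question. A sketch, assuming a hypothetical schema; field names and the `internal-llm-v1` version string are placeholders to align with your bank's actual audit requirements:

```python
import datetime
import hashlib
import json

def audit_record(question: str, answer: str, sources: list,
                 model_version: str = "internal-llm-v1") -> dict:
    """Build one traceable audit entry per answered question."""
    return {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "question": question,
        "question_hash": hashlib.sha256(question.encode("utf-8")).hexdigest(),
        "answer": answer,
        "sources": sources,              # retrieved document IDs + scores
        "model_version": model_version,  # lets compliance replay the exact setup
    }

record = audit_record(
    "What is the overdraft fee waiver rule?",
    "(answer text)",
    [{"file": "fees.pdf", "score": 0.82}],
)
print(json.dumps(record, indent=2))
```

Writing these records to append-only storage is what lets a reviewer reconstruct why any given answer was produced.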

Common Pitfalls

  • Letting the model answer without grounding

    • Mistake: prompting the LLM directly with no retrieval layer.
    • Fix: always route through VectorStoreIndex + retriever + query engine so answers come from approved docs.
  • Mixing customer data into the policy corpus

    • Mistake: indexing call transcripts or account notes alongside policies.
    • Fix: keep policy Q&A separate from customer-specific systems; otherwise you create privacy risk and bad retrieval behavior.
  • Ignoring document freshness

    • Mistake: building the index once and never refreshing it after policy changes.
    • Fix: rebuild or incrementally update the index on every controlled policy release.
  • No refusal path

    • Mistake: answering everything as if it were a simple FAQ.
    • Fix: add explicit guardrails for legal/regulatory questions and route exceptions to human review.
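One way to catch the freshness pitfall is to fingerprint the policy corpus on a schedule and refresh only when it changes. A sketch; the fingerprinting approach is an assumption, and `refresh_ref_docs` is LlamaIndex's call for incremental index updates:

```python
import hashlib
from pathlib import Path

def corpus_fingerprint(policy_dir: str) -> str:
    """Hash every file in the policy directory so any content change,
    addition, or rename flips the fingerprint."""
    h = hashlib.sha256()
    for path in sorted(Path(policy_dir).rglob("*")):
        if path.is_file():
            h.update(path.name.encode("utf-8"))
            h.update(path.read_bytes())
    return h.hexdigest()

# On each scheduled run, compare against the fingerprint stored at last build:
# if corpus_fingerprint("./bank_policies") != last_fingerprint:
#     docs = SimpleDirectoryReader("./bank_policies", recursive=True).load_data()
#     index.refresh_ref_docs(docs)
```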

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
