How to Build a Policy Q&A Agent Using LlamaIndex in Python for Payments
A policy Q&A agent for payments answers questions like “Can we refund this charge?”, “What’s the chargeback window for this card network?”, or “Is this merchant category allowed under our policy?” It matters because payment ops, support, and risk teams need fast answers that are consistent with policy, audit-friendly, and grounded in the latest internal docs instead of tribal knowledge.
Architecture
- •Policy document store
  - •Source of truth for payment policies: refunds, chargebacks, KYC/AML escalation, dispute handling, merchant restrictions.
  - •Usually a mix of PDFs, Confluence exports, markdown docs, and ticket runbooks.
- •Ingestion pipeline
  - •Converts documents into LlamaIndex Document objects.
  - •Splits them into chunks with SentenceSplitter so retrieval works on precise policy sections.
- •Vector index
  - •Built with VectorStoreIndex.
  - •Stores embeddings for semantic retrieval over policy text.
- •Retriever
  - •Uses index.as_retriever(similarity_top_k=...).
  - •Pulls the most relevant policy chunks for each user question.
- •Response synthesizer / query engine
  - •Built with index.as_query_engine(...).
  - •Generates grounded answers and returns the source nodes behind them, so support agents can trace decisions back to source text.
- •Guardrails layer
  - •Blocks unsupported requests, forces escalation on ambiguous cases, and redacts sensitive data.
  - •Important for PCI-adjacent workflows and internal compliance controls.
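The redaction step of the guardrails layer can be sketched with the standard library alone. This is a minimal illustration, not a PCI-compliant scrubber. The names PAN_CANDIDATE, luhn_valid, and redact_pans are hypothetical, and the sketch assumes card numbers appear as 13 to 19 digits, optionally separated by single spaces or hyphens.

```python
import re

# Assumption: a PAN candidate is 13-19 digits, optionally separated by
# single spaces or hyphens.
PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_pans(text: str) -> str:
    """Replace Luhn-valid card-number-like sequences with a placeholder."""
    def _mask(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        # Leave non-Luhn sequences (order IDs, phone numbers) untouched.
        return "[REDACTED_PAN]" if luhn_valid(digits) else match.group()

    return PAN_CANDIDATE.sub(_mask, text)
```

Applying a function like this to the question before it reaches the retriever, the LLM, or the logs would keep card numbers out of all three.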
Implementation
- •Install dependencies and load your policy docs
Use LlamaIndex core components plus a local embedding model or your approved embedding provider. For payments teams, keep the document corpus scoped to approved regions if you have data residency requirements.
```bash
pip install llama-index llama-index-embeddings-openai pypdf
```

```python
from pathlib import Path

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure embedding model once for the app.
# OpenAIEmbedding reads the API key from the OPENAI_API_KEY environment variable.
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

docs_path = Path("./payment_policies")

documents = SimpleDirectoryReader(
    input_dir=str(docs_path),
    recursive=True,
).load_data()
```
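One way to keep the corpus scoped to approved regions is to select files before they are loaded. The sketch below assumes a hypothetical layout in which each region has its own folder directly under the corpus root (for example payment_policies/eu/...). ALLOWED_REGIONS, POLICY_SUFFIXES, and select_policy_files are illustrative names, not part of LlamaIndex.

```python
from pathlib import Path

# Hypothetical: regions this deployment is approved to process.
ALLOWED_REGIONS = {"eu", "uk"}
# Hypothetical: file types treated as policy documents.
POLICY_SUFFIXES = {".pdf", ".md", ".txt"}


def select_policy_files(root: Path, allowed_regions: set[str]) -> list[str]:
    """Return sorted policy file paths that sit under approved region folders."""
    selected = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in POLICY_SUFFIXES:
            continue
        # The first path component below root is treated as the region code.
        region = path.relative_to(root).parts[0].lower()
        if region in allowed_regions:
            selected.append(str(path))
    return sorted(selected)
```

The resulting list could then be passed to SimpleDirectoryReader through its input_files argument instead of input_dir, so documents from other regions are never read or embedded.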
- •Chunk the policies and build the index
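Payment policies are dense. If chunks are too large, retrieval gets noisy; if they are too small, you lose the context around exceptions and thresholds. SentenceSplitter is a good default because it prefers to break on sentence boundaries, so clauses are less likely to be cut mid-sentence. Both chunk_size and chunk_overlap are measured in tokens.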
```python
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=800, chunk_overlap=120)
nodes = splitter.get_nodes_from_documents(documents)

index = VectorStoreIndex(nodes)
```
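To see why overlap matters, here is a deliberately simplified word-based sliding window. It is not how SentenceSplitter works internally (that splitter is token-based and sentence-aware). The function name window_chunks is hypothetical and exists only to show how an overlap lets a condition and its exception land in the same chunk.

```python
def window_chunks(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split words into windows of `size` that share `overlap` words with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


policy = "Refunds are allowed within 30 days unless the charge is already under dispute"
for chunk in window_chunks(policy.split(), size=8, overlap=4):
    print(" ".join(chunk))
```

With no overlap, the sentence is cut once and the exception clause is split across two chunks. With an overlap, neighbouring chunks repeat the words around the cut, which raises the chance that one chunk carries both the rule and its exception.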
- •Expose a query engine that returns grounded answers
This is the main pattern: retrieve relevant policy nodes, synthesize an answer, and return the source nodes alongside it so every answer can be traced. For support workflows, keep similarity_top_k low enough that unrelated policies do not bleed into the answer.
```python
query_engine = index.as_query_engine(
    similarity_top_k=4,
    response_mode="compact",
)

questions = [
    "What is our refund policy for duplicate card charges?",
    "When do we escalate a disputed ACH debit?",
]

for q in questions:
    response = query_engine.query(q)
    print("\nQ:", q)
    print("A:", response.response)
    print("Sources:")
    for source in response.source_nodes:
        print("-", source.node.metadata.get("file_name", "unknown"), "|", source.score)
```
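The similarity scores on the source nodes can also drive a simple coverage check before an answer is shown. The sketch below is an assumption-laden illustration. MIN_SCORE, MIN_SOURCES, and has_enough_support are hypothetical names, and the threshold values are placeholders that would need tuning, since score ranges depend on the embedding model and vector store in use.

```python
from typing import Optional, Sequence

# Hypothetical thresholds; tune them against labelled questions from your corpus.
MIN_SCORE = 0.75
MIN_SOURCES = 2


def has_enough_support(
    scores: Sequence[Optional[float]],
    min_score: float = MIN_SCORE,
    min_sources: int = MIN_SOURCES,
) -> bool:
    """Return True when enough retrieved chunks clear the similarity threshold."""
    # A score can be missing (None), so treat that as no support.
    strong = [s for s in scores if s is not None and s >= min_score]
    return len(strong) >= min_sources
```

In the loop above, the input would be something like [source.score for source in response.source_nodes]. A False result is a reasonable trigger for routing the question to a human instead of returning the generated answer.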
- •Add a lightweight payment-specific guardrail before answering
In production you should not answer everything. If a question asks for disallowed actions like bypassing KYC or exposing PAN data, route it to human review or a compliance workflow instead of generating an answer.
```python
BLOCKLIST = [
    "full card number",
    "cvv",
    "bypass kyc",
    "ignore aml",
]


def should_block(question: str) -> bool:
    q = question.lower()
    return any(term in q for term in BLOCKLIST)


def answer_question(question: str):
    if should_block(question):
        return {
            "answer": "This request requires compliance review.",
            "escalate": True,
        }

    response = query_engine.query(question)
    return {
        "answer": response.response,
        "escalate": False,
        "sources": [
            {
                "file_name": sn.node.metadata.get("file_name", "unknown"),
                "score": sn.score,
            }
            for sn in response.source_nodes
        ],
    }


result = answer_question("Can we share the full card number with support?")
print(result)
```
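Because answer_question returns a plain dictionary, it is straightforward to turn each interaction into an audit record. The following is a minimal sketch. build_audit_record and the user_id field are hypothetical, and a real deployment would likely also add node IDs and document versions to each source entry and redact sensitive values before writing.

```python
import json
from datetime import datetime, timezone


def build_audit_record(question: str, result: dict, user_id: str) -> str:
    """Serialize one Q&A interaction as a JSON line for an append-only log."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "question": question,
        "answer": result["answer"],
        "escalate": result["escalate"],
        # Blocked requests carry no sources, so default to an empty list.
        "sources": result.get("sources", []),
    }
    return json.dumps(record, sort_keys=True)
```

Each returned string can be appended as one line to a log file or sent to whatever log pipeline your compliance team already reviews.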
Production Considerations
- •Keep an audit trail
  - •Log the user question, retrieved node IDs, document versions, answer text, and escalation outcome.
  - •For payments operations, you need to prove why a decision was made during disputes or regulatory reviews.
- •Respect data residency
  - •If policies or examples contain regional customer data, keep embeddings and vector storage in-region.
  - •Don’t move EU payment policy corpora into non-EU infrastructure without a legal basis and explicit controls.
- •Add deterministic guardrails
  - •Hard-block requests involving PAN/CVV handling, fraud evasion, sanctions evasion, or bypassing controls.
  - •Route ambiguous refund/chargeback questions to humans when confidence is low or source coverage is thin.
- •Monitor retrieval quality
  - •Track top-k hit rate, citation coverage, unanswered questions, and escalation frequency.
  - •In payments support flows, stale policy retrieval causes bad customer outcomes fast.
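Top-k hit rate, one of the metrics listed above, can be measured with a small labelled set. The sketch below assumes a hypothetical evaluation set that maps each question to the policy file expected among the retrieved sources. top_k_hit_rate and the retrieve callable are illustrative names.

```python
from typing import Callable


def top_k_hit_rate(
    eval_set: dict[str, str],
    retrieve: Callable[[str], list[str]],
) -> float:
    """Fraction of questions whose expected file appears among the retrieved files."""
    if not eval_set:
        return 0.0
    hits = sum(
        1 for question, expected in eval_set.items()
        if expected in retrieve(question)
    )
    return hits / len(eval_set)
```

In practice, retrieve would wrap a call to the retriever built with index.as_retriever(similarity_top_k=...) and return the file_name metadata of each retrieved node. Tracking this number across policy updates makes stale or missing documents visible before support agents run into them.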
Common Pitfalls
- •Using generic chunking on dense policy docs
  - •A naive splitter can break apart exception clauses from their conditions.
  - •Fix it by tuning chunk_size and chunk_overlap, then validating retrieval against real payment scenarios like partial refunds or card-present disputes.
- •Letting the agent answer outside its scope
  - •If you don’t gate sensitive queries, users will ask it to explain fraud rules or reveal restricted data.
  - •Fix it with explicit blocklists plus escalation logic tied to compliance and support queues.
- •Skipping document versioning
  - •Payment policies change often across networks, regions, and product lines.
  - •Fix it by storing doc metadata such as version date, region, and owner in each node so every answer can be traced back to the exact policy revision.
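One low-effort way to attach version metadata is to derive it from file names. The sketch below assumes a hypothetical naming convention of topic, region, and ISO version date separated by double underscores (for example refunds__eu__2024-05-01.md). policy_metadata is an illustrative helper, and fields such as owner would have to come from another source, such as a document registry.

```python
from datetime import date
from pathlib import Path


def policy_metadata(file_path: str) -> dict:
    """Derive traceability metadata from a '<topic>__<region>__<YYYY-MM-DD>' file name."""
    path = Path(file_path)
    # Always keep file_name so source listings still work for unversioned files.
    meta = {"file_name": path.name}
    parts = path.stem.split("__")
    if len(parts) == 3:
        topic, region, version = parts
        try:
            date.fromisoformat(version)  # reject names without a valid date
        except ValueError:
            return meta
        meta.update(
            {"topic": topic, "region": region.lower(), "version_date": version}
        )
    return meta
```

SimpleDirectoryReader accepts a file_metadata callable, so passing a function like this there should attach the fields to every loaded document, and the nodes created from those documents inherit them.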
By Cyprian Aarons, AI Consultant at Topiax.