How to Build a Customer Support Agent Using LlamaIndex in Python for Payments

By Cyprian Aarons · Updated 2026-04-21
customer-support · llamaindex · python · payments

A customer support agent for payments answers questions like “Where is my refund?”, “Why was my card declined?”, and “What does this charge descriptor mean?” without forcing a human to search through policy docs, ticket history, and processor logs. For payments teams, that matters because support quality is tied directly to trust, chargeback risk, compliance, and how fast you can resolve money-moving issues.

Architecture

  • User-facing chat layer

    • A web app, support console, or internal tool where agents and customers ask questions.
    • This layer should pass along account context, locale, and request metadata.
  • Retrieval layer with LlamaIndex

    • Use VectorStoreIndex for policy docs, FAQ content, dispute playbooks, and processor runbooks.
    • Add QueryEngine on top so the agent can answer from approved sources instead of guessing.
  • Structured payment context

    • Pull transaction status, refund state, dispute state, and merchant metadata from your payment backend.
    • Keep this data separate from untrusted documents and expose it through controlled tools.
  • Guardrails and policy filters

    • Block requests for full PANs, CVVs, secrets, or unsupported actions.
    • Enforce “no action without verification” rules for refunds, cancellations, or account changes.
  • Audit logging

    • Store the user question, retrieved sources, model response, tool calls, and final decision.
    • Payments teams need traceability for disputes, compliance reviews, and incident analysis.
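The guardrail layer above can be sketched as a pre-filter that runs before any retrieval or tool call. This is an illustrative stdlib-only sketch: the patterns, intent names, and the `check_request` helper are assumptions for this article, not part of LlamaIndex.

```python
import re

# Hypothetical patterns for requests that must be blocked outright.
BLOCKED_PATTERNS = [
    re.compile(r"\b(full|complete)\s+(card|pan)\b", re.IGNORECASE),
    re.compile(r"\b(cvv|cvc|security\s+code)\b", re.IGNORECASE),
]

# Hypothetical intents that require identity verification before any action.
VERIFIED_ONLY_INTENTS = {"refund", "cancel", "account_change"}

def check_request(question: str, intent: str, verified: bool) -> str:
    """Return 'allow', 'block', or 'verify' for an incoming support request."""
    if any(p.search(question) for p in BLOCKED_PATTERNS):
        return "block"
    if intent in VERIFIED_ONLY_INTENTS and not verified:
        return "verify"
    return "allow"
```

A "verify" result would route the user through your existing identity flow before the agent is allowed to call any money-moving tool.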

Implementation

1) Index your approved support knowledge

Start with docs that are safe to expose: refund policy, settlement timelines, dispute handling steps, processor error code explanations. In production you would load these from S3, SharePoint, Confluence export files, or a document store.

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure embeddings once at startup
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Load approved support documents
documents = SimpleDirectoryReader(
    input_dir="./payment_support_docs",
    required_exts=[".md", ".txt", ".pdf"],
).load_data()

# Build the retrieval index
index = VectorStoreIndex.from_documents(documents)

# Create a query engine for support Q&A
query_engine = index.as_query_engine(similarity_top_k=4)

This gives you a retrieval-backed answer path. The key point is that the agent should answer from curated material first, not from model memory.

2) Add a payment-aware system prompt

Payments support needs narrow behavior. The agent should explain policies, ask for missing non-sensitive identifiers, and refuse risky requests like exposing full card details.

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)

SYSTEM_PROMPT = """
You are a payments customer support agent.
Rules:
- Answer only using retrieved policy/docs or approved transaction context.
- Never request or reveal PANs, CVVs, secrets, or full bank details.
- For refunds/disputes/account changes: require identity verification before action.
- If the answer is not in the docs/context, say what is missing and escalate.
- Mention timelines in business days when relevant.
"""

# "condense_plus_context" supports a system prompt; plain
# "condense_question" does not accept one.
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    system_prompt=SYSTEM_PROMPT,
)

Use low temperature here. Support responses should be consistent across agents and easy to audit.

3) Connect structured transaction data as controlled context

A real payments agent usually needs order status or refund state from an internal API. Do not dump raw database records into the prompt; fetch only the fields you want the model to see.

def get_payment_context(transaction_id: str) -> dict:
    # Replace with a real service call
    return {
        "transaction_id": transaction_id,
        "status": "settled",
        "refund_status": "pending",
        "processor_code": "R12",
        "created_at": "2026-04-18T09:15:00Z",
    }

def answer_support_question(question: str, transaction_id: str):
    tx_context = get_payment_context(transaction_id)

    prompt = f"""
Customer question: {question}

Approved transaction context:
- transaction_id: {tx_context['transaction_id']}
- status: {tx_context['status']}
- refund_status: {tx_context['refund_status']}
- processor_code: {tx_context['processor_code']}
"""

    response = chat_engine.chat(prompt)
    return {
        "answer": str(response),
        "transaction_id": transaction_id,
        "source": "llamaindex_chat_engine",
    }

result = answer_support_question(
    "Why hasn't my refund arrived yet?",
    transaction_id="txn_12345",
)
print(result["answer"])

This pattern keeps your business data in your control. The model gets just enough context to explain the state without exposing unnecessary PII.
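Before any record or document reaches the index or a prompt, card numbers should already be masked. A minimal sketch, assuming a regex for 13-19 digit runs plus a Luhn check so ordinary long numbers are left alone; the `redact_pans` helper is illustrative, not a library API.

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to distinguish card numbers from other digit runs."""
    digits = [int(d) for d in number][::-1]
    total = sum(digits[0::2])
    for d in digits[1::2]:
        total += d * 2 - 9 if d * 2 > 9 else d * 2
    return total % 10 == 0

# Candidate PANs: standalone runs of 13-19 digits.
CANDIDATE = re.compile(r"\b\d{13,19}\b")

def redact_pans(text: str) -> str:
    """Replace Luhn-valid digit runs with a masked form keeping the last 4."""
    def mask(m: re.Match) -> str:
        num = m.group(0)
        return "****" + num[-4:] if luhn_valid(num) else num
    return CANDIDATE.sub(mask, text)
```

For example, `redact_pans("Card 4242424242424242 was declined")` yields `"Card ****4242 was declined"`, while a non-Luhn reference number passes through untouched.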

4) Return source-backed answers with escalation paths

For payments support you want answers plus citations or at least visible evidence. LlamaIndex can return source nodes so your team can verify what informed the response.

query_response = query_engine.query(
    "What is the timeline for card refund settlement?"
)

print(str(query_response))

if hasattr(query_response, "source_nodes"):
    for node in query_response.source_nodes[:3]:
        print("SOURCE:", node.node.text[:200])

If the retrieved sources do not cover the question cleanly, route to a human queue. That is better than inventing policy around reversals or dispute windows.
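The escalation rule can be made explicit with a score threshold over the retrieved nodes. A sketch assuming each retrieval result carries a similarity score, as LlamaIndex source nodes do; the cutoff value and the `route_response` helper are illustrative and should be tuned against labeled tickets.

```python
ESCALATION_THRESHOLD = 0.75  # assumed cutoff; tune against real transcripts

def route_response(answer: str, scores: list[float]) -> dict:
    """Escalate to a human queue when retrieval support is weak or absent."""
    best = max(scores, default=0.0)
    if best < ESCALATION_THRESHOLD:
        return {"action": "escalate", "reason": f"low retrieval score {best:.2f}"}
    return {"action": "respond", "answer": answer}
```

Routing on the best score rather than the average keeps one strong policy match from being diluted by weaker neighbors.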

Production Considerations

  • Deploy regionally for data residency

    • Keep embeddings storage and LLM traffic in approved regions if your payment data cannot cross borders.
    • If you operate in multiple jurisdictions, split indexes by region rather than sharing one global knowledge base.
  • Log everything needed for audit

    • Capture user ID, session ID, retrieved document IDs, tool calls, response text, and escalation reason.
    • For refunds or disputes, retain immutable logs long enough to satisfy compliance and chargeback review requirements.
  • Add guardrails before any action

    • The agent should not initiate refunds or account changes directly unless identity verification has passed.
    • Put a policy layer in front of tool calls so unsupported actions are blocked before they hit payment systems.
  • Monitor answer quality by intent

    • Track containment rate for intents like refund status, failed payment explanation, dispute timeline, and fee clarification.
    • Also track hallucination reports and escalations by processor code so you can fix weak retrieval coverage fast.
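The audit fields listed above can be captured as one structured record per interaction. A stdlib-only sketch; the field names and the hash-chaining idea for tamper evidence are one reasonable design, not a LlamaIndex feature.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(prev_hash: str, user_id: str, session_id: str,
                 question: str, doc_ids: list[str], answer: str,
                 escalated: bool) -> dict:
    """Build an append-only audit entry; each entry hashes its predecessor."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "session_id": session_id,
        "question": question,
        "retrieved_doc_ids": doc_ids,
        "answer": answer,
        "escalated": escalated,
        "prev_hash": prev_hash,
    }
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    return entry
```

Chaining each record to the previous hash makes after-the-fact edits detectable, which is useful when a chargeback review asks what the agent actually saw and said.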

Common Pitfalls

  1. Letting the model see raw sensitive data

    • Mistake: passing full payment records into prompts or documents.
    • Fix: redact PANs/CVVs/tokens before indexing; pass only minimal structured fields needed for support.
  2. Using one generic knowledge base for all regions

    • Mistake: mixing EU retention rules with US dispute timelines and APAC data handling policies.
    • Fix: partition indexes by region and route queries based on customer jurisdiction.
  3. Treating every question as answerable by RAG

    • Mistake: assuming retrieval will solve refunds that depend on live ledger state or KYC status.
    • Fix: use RAG for policies and explanations; use secure tools for live payment state; escalate when verification is required.
  4. Skipping auditability

    • Mistake: returning answers without preserving sources or decision traces.
    • Fix: store retrieved nodes, prompt inputs (with redaction), output text, and any tool invocation metadata so compliance can reconstruct what happened later.
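The fix for pitfall 2 reduces to a routing table from customer jurisdiction to the right knowledge base. A sketch with hypothetical region codes and a `pick_index` helper; in production each value would name a separately built, regionally stored VectorStoreIndex.

```python
# Hypothetical mapping from jurisdiction to a per-region knowledge base name.
REGION_INDEXES = {
    "DE": "eu_support_index",
    "FR": "eu_support_index",
    "US": "us_support_index",
    "SG": "apac_support_index",
}

def pick_index(country_code: str) -> str:
    """Route a query to its regional index; fail closed for unknown regions."""
    try:
        return REGION_INDEXES[country_code]
    except KeyError:
        raise ValueError(f"No approved knowledge base for region {country_code!r}")
```

Failing closed on an unknown region is deliberate: answering from the wrong jurisdiction's policy is worse than escalating.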

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
