How to Build a Customer Support Agent Using LlamaIndex in TypeScript for Retail Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: customer-support · llamaindex · typescript · retail-banking

A customer support agent for retail banking answers account, card, loan, and branch questions by retrieving policy-approved information and then generating a grounded response. It matters because banking support has a low tolerance for hallucinations, weak audit trails, and data handling mistakes.

Architecture

  • User channel adapter

    • Handles web chat, mobile app chat, or contact-center handoff.
    • Normalizes incoming messages into a single request format.
  • Knowledge ingestion layer

    • Pulls from FAQs, product docs, fee schedules, card dispute policies, and service scripts.
    • Splits content into chunks and indexes it with LlamaIndex.
  • Retrieval layer

    • Uses VectorStoreIndex plus a retriever to fetch only relevant policy snippets.
    • Keeps the model grounded in approved banking content.
  • Response orchestration

    • Uses QueryEngine or ChatEngine to combine retrieved context with the user question.
    • Adds system rules for compliance, refusal behavior, and escalation.
  • Audit and logging

    • Stores prompts, retrieved sources, answer text, and confidence metadata.
    • Supports internal review for regulated workflows.
  • Guardrail layer

    • Detects requests involving PII, account-specific actions, or prohibited advice.
    • Routes those cases to authenticated tools or human agents.
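The architecture above assumes every channel adapter emits one normalized request shape. A minimal sketch of that shape follows; all field and type names here are illustrative choices for this guide, not LlamaIndex types:

```typescript
import { randomUUID } from "node:crypto";

// Illustrative normalized request that every channel adapter
// (web chat, mobile, contact center) produces.
type SupportChannel = "web" | "mobile" | "contact-center";

interface SupportRequest {
  requestId: string;        // unique ID, reused as the audit-log key
  channel: SupportChannel;  // where the message came from
  sessionVerified: boolean; // true only after backend authentication
  message: string;          // the user's question, trimmed
  receivedAt: string;       // ISO-8601 timestamp for audit ordering
}

function normalize(channel: SupportChannel, raw: string): SupportRequest {
  return {
    requestId: randomUUID(),
    channel,
    sessionVerified: false, // adapters never assert identity themselves
    message: raw.trim(),
    receivedAt: new Date().toISOString(),
  };
}
```

Keeping `sessionVerified` in the request itself makes the later guardrail checks trivial: downstream layers never have to guess which channel a message came from or whether the user is authenticated.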

Implementation

1) Install dependencies and set up your index

Use the TypeScript package from LlamaIndex and load approved banking documents. For production, keep your source files in a controlled repository or document store with access review.

npm install llamaindex

import {
  Document,
  SimpleDirectoryReader,
  VectorStoreIndex,
} from "llamaindex";

async function buildIndex() {
  const reader = new SimpleDirectoryReader();
  const documents = await reader.loadData("./banking-content");

  const docs = documents.map(
    (doc) =>
      new Document({
        text: doc.text,
        metadata: {
          source: doc.metadata?.fileName ?? "unknown",
          domain: "retail-banking",
        },
      }),
  );

  return await VectorStoreIndex.fromDocuments(docs);
}

This pattern works well when your content is already approved by compliance. If you need stronger controls, ingest only versioned PDFs or markdown files that have been reviewed by legal and product owners.
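One way to enforce that review gate is a manifest check before indexing. The sketch below assumes a compliance-maintained map of approved file names to reviewed versions; the manifest shape is hypothetical and would come from your document-control system in practice:

```typescript
// Hypothetical manifest: file name -> the version compliance signed off on.
const approvedManifest = new Map<string, string>([
  ["fee-schedule.md", "v3"],
  ["card-dispute-policy.md", "v7"],
]);

// A file is only indexable if it appears in the manifest
// at exactly the reviewed version.
function isApprovedSource(fileName: string, version: string): boolean {
  return approvedManifest.get(fileName) === version;
}
```

Running every candidate document through a check like this before it reaches `VectorStoreIndex.fromDocuments` means stale or draft files fail closed instead of silently entering the index.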

2) Create a retrieval-based query engine

For customer support, don’t let the model answer from memory. Force it to retrieve from your index first.

import { OpenAI, Settings } from "llamaindex";

// Deterministic model configuration, applied globally.
Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

const SYSTEM_RULES = `
You are a retail banking support assistant.
Answer only from retrieved context.
If the answer is not in context, say you cannot confirm it and offer escalation.
Never request full card numbers, PINs, passwords, or OTPs.
`.trim();

async function createSupportAgent() {
  const index = await buildIndex();

  // Retrieval-first: answers come only from the top-k indexed chunks.
  const queryEngine = index.asQueryEngine({
    retriever: index.asRetriever({ similarityTopK: 3 }),
  });

  // Prepend the compliance rules to every query so the model
  // always sees them alongside the retrieved context.
  return {
    query: ({ query }: { query: string }) =>
      queryEngine.query({ query: `${SYSTEM_RULES}\n\n${query}` }),
  };
}

The important pieces here are temperature: 0 and retrieval-first prompting. In banking support, deterministic behavior beats creative phrasing.

3) Add a safe response wrapper with escalation logic

You need a thin policy layer before and after retrieval. This is where you block sensitive requests and route account-specific actions to authenticated systems or humans.

function requiresEscalation(message: string): boolean {
  // Word boundaries keep short tokens like "pin" and "otp" from
  // matching inside unrelated words ("shopping", "spinning").
  const sensitivePatterns = [
    /full card number/i,
    /\bpin\b/i,
    /\botp\b/i,
    /\bpassword\b/i,
    /transfer money/i,
    /close my account/i,
    /chargeback status/i,
  ];

  return sensitivePatterns.some((pattern) => pattern.test(message));
}

// Build the agent once and reuse it across requests;
// re-indexing on every message would be slow and expensive.
let agentPromise: ReturnType<typeof createSupportAgent> | null = null;

async function answerSupportMessage(message: string) {
  if (requiresEscalation(message)) {
    return {
      answer:
        "I can help with general policy questions. For account-specific requests or sensitive data changes, I’m routing this to a verified support agent.",
      escalated: true,
    };
  }

  agentPromise ??= createSupportAgent();
  const agent = await agentPromise;
  const response = await agent.query({ query: message });

  return {
    answer: response.toString(),
    escalated: false,
    sources: response.sourceNodes?.map((node) => ({
      text: node.node.getText(),
      source: node.node.metadata?.source,
    })),
  };
}

This keeps the agent inside its lane. Retail banking support should explain policies like fee waivers or card replacement timelines, not perform unauthenticated operations.

4) Return answers with citations

For regulated environments, every answer should be traceable back to source material. LlamaIndex gives you source nodes from the query response; use them.

async function main() {
  const result = await answerSupportMessage(
    "What is the replacement time for a lost debit card?",
  );

  console.log(JSON.stringify(result, null, 2));
}

main().catch(console.error);

A good support experience here is simple:

  • concise answer
  • source citation
  • escalation path if the content is missing or user-specific
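Those three bullets can be assembled into a single user-facing message. A minimal formatter follows, with illustrative types that mirror the shape answerSupportMessage returns:

```typescript
// Illustrative types mirroring the support reply shape used above.
interface Citation {
  source: string;
}

interface SupportReply {
  answer: string;
  escalated: boolean;
  sources?: Citation[];
}

// Render the reply as the user-facing message: concise answer first,
// then citations, then an escalation note when one was triggered.
function renderReply(reply: SupportReply): string {
  const lines = [reply.answer];
  if (reply.sources?.length) {
    lines.push("Sources: " + reply.sources.map((s) => s.source).join(", "));
  }
  if (reply.escalated) {
    lines.push("A verified support agent will follow up on this request.");
  }
  return lines.join("\n");
}
```

Keeping rendering separate from retrieval also gives the audit layer one stable place to capture exactly what the customer saw.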

Production Considerations

  • Deploy in-region

    • Keep embeddings, vector store data, and logs in the same jurisdiction as your bank’s residency requirements.
    • If your bank requires EU-only or US-only processing, enforce that at the infrastructure layer.
  • Log for auditability

    • Store user input, retrieved chunks, model output, timestamp, tenant ID, and escalation outcome.
    • Mask PII before logs hit observability systems like Datadog or CloudWatch.
  • Add guardrails for regulated intents

    • Block requests involving authentication secrets, payment instructions without verification, disputes requiring case lookup, and complaints needing official workflow handling.
    • Use deterministic rules first; do not rely on the LLM to self-police every time.
  • Monitor retrieval quality

    • Track top-k hit rate, unanswered questions, hallucination reports from QA teams, and deflection rate to human agents.
    • If users keep asking about something the index cannot answer accurately, your knowledge base is incomplete or stale.
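For the PII-masking point above, deterministic regexes are a reasonable first layer before logs leave your boundary. The patterns below are illustrative and deliberately not exhaustive; a production system would add account numbers, IBANs, and phone numbers, ideally via a vetted PII-detection library:

```typescript
// Deterministic pre-log masking. Illustrative patterns only:
// card-length digit runs and a simple email shape.
function maskPii(text: string): string {
  return text
    // 13-16 digit runs (card numbers), with optional space/dash separators
    .replace(/\b(?:\d[ -]?){13,16}\b/g, "[CARD]")
    // simple email pattern
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL]");
}
```

Run this at the logging boundary itself, not in individual handlers, so no code path can forget it.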

Common Pitfalls

  • Letting the model answer without retrieval

    If you skip VectorStoreIndex or don’t force grounded responses through asQueryEngine, you’ll get fluent nonsense on fee rules or eligibility criteria. Fix it by making retrieval mandatory for every support turn.

  • Indexing unapproved content

    Pulling in internal notes, drafts, Slack exports, or outdated PDFs creates compliance risk fast. Only ingest reviewed content with clear ownership and version control.

  • Ignoring authentication boundaries

A chatbot should never expose balances, transaction history, or account changes just because a user sounds legitimate. Require verified sessions and route any account-specific action through authenticated backend services.
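A sketch of that authentication boundary follows, assuming a session object your backend already populates; all names and intent patterns here are hypothetical:

```typescript
// Hypothetical session shape; real verification happens in your
// authenticated backend, never in the chat layer.
interface Session {
  verified: boolean;
  customerId?: string;
}

// Illustrative account-specific intents.
const ACCOUNT_INTENTS = [/balance/i, /transaction history/i, /change (my )?address/i];

function isAccountSpecific(message: string): boolean {
  return ACCOUNT_INTENTS.some((p) => p.test(message));
}

// Gate: account-specific intents require a verified session;
// everything else can go to the retrieval agent.
function route(session: Session, message: string): "agent" | "authenticated-flow" | "deny" {
  if (!isAccountSpecific(message)) return "agent";
  return session.verified ? "authenticated-flow" : "deny";
}
```

Note the fail-closed default: an unverified session asking an account-specific question is denied, not answered.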

Retail banking support agents work when they are narrow. Keep the model grounded in approved documents, keep sensitive actions behind auth gates, and keep an audit trail that compliance can actually use.



By Cyprian Aarons, AI Consultant at Topiax.
