How to Build a Customer Support Agent Using LlamaIndex in TypeScript for Investment Banking

By Cyprian Aarons · Updated 2026-04-21
customer-support · llamaindex · typescript · investment-banking

A customer support agent for investment banking answers client and internal operations questions against approved knowledge sources: product docs, onboarding guides, fee schedules, policy manuals, and runbooks. It matters because the support layer in banking is not just about speed; it has to be accurate, auditable, permission-aware, and safe around regulated content.

Architecture

  • Document ingestion layer

    • Pulls from approved sources only: PDFs, internal wikis, policy docs, CRM exports, and ticket macros.
    • Normalizes content into Document objects before indexing.
  • Indexing layer

    • Uses LlamaIndex to chunk and index content for retrieval.
    • For support use cases, a VectorStoreIndex is usually enough to start.
  • Retrieval and response layer

    • Fetches relevant context with a retriever.
    • Generates grounded answers through a QueryEngine.
  • Guardrails layer

    • Blocks unsupported topics like trade advice, legal interpretation, or confidential client data leakage.
    • Forces escalation when confidence is low or the query is out of scope.
  • Audit and observability layer

    • Logs question, retrieved chunks, answer, source IDs, and escalation reason.
    • Needed for compliance review and incident reconstruction.
  • Deployment boundary

    • Runs in a controlled environment with regional data residency constraints.
    • Keeps embeddings, source docs, and logs inside approved infrastructure.
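
The flow through these layers can be sketched as a single handler with injected dependencies. Every name below is an illustrative stub, not a LlamaIndex API; wire in your real retriever, generator, and audit sink:

```typescript
// Shapes for the retrieval, generation, and audit layers described above.
type RetrievedChunk = { text: string; source: string };
type Retriever = (question: string) => Promise<RetrievedChunk[]>;
type Generator = (question: string, context: string[]) => Promise<string>;
type AuditSink = (record: Record<string, unknown>) => Promise<void>;

interface AnswerResult {
  answer: string;
  escalated: boolean;
  sources: string[];
}

export async function handleQuestion(
  question: string,
  retrieve: Retriever,
  generate: Generator,
  audit: AuditSink,
): Promise<AnswerResult> {
  const chunks = await retrieve(question);

  // Guardrails layer: escalate instead of guessing when no approved
  // source covers the question.
  if (chunks.length === 0) {
    const result: AnswerResult = {
      answer: "No approved source covers this question. Escalating to a human agent.",
      escalated: true,
      sources: [],
    };
    await audit({ question, ...result, ts: new Date().toISOString() });
    return result;
  }

  const answer = await generate(question, chunks.map((c) => c.text));
  const result: AnswerResult = {
    answer,
    escalated: false,
    sources: chunks.map((c) => c.source),
  };

  // Audit layer: persist question, sources, answer, and outcome.
  await audit({ question, ...result, ts: new Date().toISOString() });
  return result;
}
```

The dependency injection is deliberate: it lets you swap the retriever for a LlamaIndex query engine in production and for fakes in tests, and it guarantees every answer passes through the audit sink.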

Implementation

1) Install the TypeScript packages

Use the official LlamaIndex TypeScript SDK packages. For a support agent you need core indexing plus an LLM provider.

npm install llamaindex dotenv

Set your model key in .env:

OPENAI_API_KEY=your_key_here

2) Load approved banking support documents

Keep the corpus tight. Do not index raw chat logs or anything that contains client PII unless you have a clear retention and access policy.

import "dotenv/config";
import { Document } from "llamaindex";

export const supportDocs = [
  new Document({
    text: `
      Prime brokerage onboarding requires:
      - Signed MSAs
      - KYC completion
      - Authorized trader list approval
      Escalate incomplete KYC cases to compliance operations.
    `,
    metadata: { source: "pb_onboarding_playbook", department: "ops" },
  }),
  new Document({
    text: `
      Equity research distribution rules:
      Do not share restricted research outside approved recipients.
      If asked for unpublished research, escalate to compliance.
    `,
    metadata: { source: "research_distribution_policy", department: "compliance" },
  }),
];

3) Build the index and query engine

This is the core pattern. VectorStoreIndex.fromDocuments() creates the index, then asQueryEngine() gives you a retrieval-backed answer interface.

import "dotenv/config";
import {
  Document,
  VectorStoreIndex,
} from "llamaindex";

async function main() {
  const documents = [
    new Document({
      text: `
        Margin call questions must be answered using approved margin policy only.
        Never provide trading recommendations or interpret legal terms beyond policy text.
        Escalate disputes involving client balances to operations.
      `,
      metadata: { source: "margin_policy", team: "client_support" },
    }),
    new Document({
      text: `
        Wire transfer cut-off times vary by currency and region.
        Confirm the booking entity before advising on settlement timing.
        If the request mentions sanctions screening or blocked payments, escalate immediately.
      `,
      metadata: { source: "payments_runbook", team: "payments_ops" },
    }),
  ];

  const index = await VectorStoreIndex.fromDocuments(documents);

  const queryEngine = index.asQueryEngine({
    similarityTopK: 3,
  });

  const response = await queryEngine.query({
    query: "What should I tell a client asking why their wire transfer is delayed?",
  });

  console.log(response.toString());
}

main().catch(console.error);

This gets you grounded retrieval with minimal surface area. In production, replace the inline documents with your ingestion pipeline that loads from S3, SharePoint, Confluence exports, or an internal document store.
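
As a starting point for that pipeline, a filesystem loader might look like the sketch below. The directory layout and metadata fields are assumptions; each returned record maps one-to-one onto the fields passed to `new Document({...})` above:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Shape matches the fields we pass to `new Document({...})`.
interface RawDoc {
  text: string;
  metadata: { source: string; department: string };
}

// Load every .md/.txt file in an approved directory into
// Document-ready records. The department tag is supplied by the
// caller (an assumption; use your own taxonomy).
export function loadApprovedDocs(dir: string, department: string): RawDoc[] {
  return fs
    .readdirSync(dir)
    .filter((name) => /\.(md|txt)$/.test(name))
    .map((name) => ({
      text: fs.readFileSync(path.join(dir, name), "utf8"),
      metadata: {
        source: path.basename(name, path.extname(name)),
        department,
      },
    }));
}
```

In the indexing step you would then do something like `loadApprovedDocs("./approved/ops", "ops").map((d) => new Document(d))` before calling `VectorStoreIndex.fromDocuments()`. Filtering by extension is the first guardrail: anything not explicitly allowed never enters the corpus.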

4) Add a bank-specific guardrail before answering

For investment banking support, the agent should refuse anything that looks like advice outside its remit. A simple pre-check stops out-of-scope questions before they ever reach the model.

const blockedTopics = [
  /buy|sell|recommend/i,
  /trade idea/i,
  /insider/i,
  /circumvent.*compliance/i,
];

function shouldEscalate(question: string): boolean {
  return blockedTopics.some((pattern) => pattern.test(question));
}

async function answerQuestion(queryEngine: any, question: string) {
  if (shouldEscalate(question)) {
    return {
      answer:
        "I can’t help with trading advice or compliance-sensitive requests. Please escalate to the appropriate desk or compliance team.",
      escalated: true,
    };
  }

  const response = await queryEngine.query({ query: question });
  return {
    answer: response.toString(),
    escalated: false,
    sourcesUsed: response.sourceNodes?.map((n: any) => n.node.metadata?.source) ?? [],
  };
}

That pattern is boring on purpose. In banking support systems, boring beats clever every time.
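
The same escalation idea extends to retrieval confidence. A sketch, assuming each retrieved node carries a similarity `score` (LlamaIndex attaches one to vector results); the 0.75 threshold is an assumption you should tune against your own corpus:

```typescript
// Minimal shape of a retrieved node for confidence checking;
// map response.sourceNodes into this form.
interface ScoredNode {
  score?: number;
  source?: string;
}

// Escalate when even the best retrieved chunk falls below a
// similarity threshold, i.e. the corpus probably does not cover
// the question and the model would be guessing.
function lowConfidence(nodes: ScoredNode[], threshold = 0.75): boolean {
  if (nodes.length === 0) return true;
  const best = Math.max(...nodes.map((n) => n.score ?? 0));
  return best < threshold;
}
```

Call this inside answerQuestion after the query returns and route to the same escalation response when it fires; that covers the "confidence is low" case from the guardrails layer without any extra model calls.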

Production Considerations

  • Data residency

    • Keep embeddings and document storage in-region if your booking entities require it.
    • If your bank has EU clients or regulated entities in specific jurisdictions, do not route retrieval traffic outside approved regions.
  • Auditability

    • Persist the original question, top retrieved chunks, final answer, timestamps, user identity, and escalation outcome.
    • This gives compliance teams something they can review when a client disputes guidance.
  • Monitoring

    • Track refusal rate, escalation rate, retrieval hit rate, and hallucination reports from support analysts.
    • A spike in “no relevant context found” usually means your corpus is stale or too fragmented.
  • Guardrails

    • Enforce allowlisted topics only: onboarding status, product process questions, payment status workflows, contact paths.
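
Concretely, the audit trail above can be one append-only record per interaction. Field names here are illustrative; align them with whatever your compliance tooling expects:

```typescript
// One record per question, persisted before the answer is returned.
interface AuditRecord {
  timestamp: string;          // ISO-8601, server clock
  userId: string;             // authenticated requester, never free text
  question: string;
  retrievedSources: string[]; // metadata.source of each chunk used
  answer: string;
  escalated: boolean;
  escalationReason?: string;
}

function buildAuditRecord(
  userId: string,
  question: string,
  result: { answer: string; escalated: boolean; sourcesUsed?: string[] },
  escalationReason?: string,
): AuditRecord {
  return {
    timestamp: new Date().toISOString(),
    userId,
    question,
    retrievedSources: result.sourcesUsed ?? [],
    answer: result.answer,
    escalated: result.escalated,
    // Only attach a reason when one was given, keeping records compact.
    ...(escalationReason ? { escalationReason } : {}),
  };
}
```

The `result` parameter matches the shape returned by answerQuestion in step 4, so the record can be built and written in the same request handler that serves the answer.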

Common Pitfalls

  • Indexing too much sensitive data

    Don’t dump raw emails or CRM notes into the vector store. Strip PII first and only ingest approved operational content with clear retention rules.

  • Letting the model answer outside policy

    If you skip pre-checks and escalation logic, the agent will eventually produce unsupported financial guidance. Put hard refusals in front of the LLM for trade advice, legal interpretation, sanctions issues, and confidential client data requests.

  • Ignoring source traceability

    A support agent without citations is hard to defend in audit reviews. Always keep metadata.source on your documents and return source references with every answer so operations can verify where the response came from.
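
For the first pitfall, a minimal redaction pass before ingestion might look like this. The patterns are illustrative and deliberately incomplete; a real deployment should sit a vetted PII detection service in front of the index:

```typescript
// Illustrative patterns only: email addresses and long digit runs
// (account or phone numbers). Not a substitute for a proper PII scanner.
const piiPatterns: [RegExp, string][] = [
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL]"],
  [/\b\d{8,}\b/g, "[NUMBER]"],
];

// Replace every match with a neutral token before the text is
// chunked, embedded, or logged.
function redactPii(text: string): string {
  return piiPatterns.reduce(
    (acc, [pattern, token]) => acc.replace(pattern, token),
    text,
  );
}
```

Run this in the ingestion layer, before `new Document(...)`, so nothing downstream (embeddings, logs, retrieved chunks) ever sees the raw values.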


By Cyprian Aarons, AI Consultant at Topiax.