How to Build a Customer Support Agent Using LlamaIndex in TypeScript for Wealth Management
A customer support agent for wealth management answers client questions about account access, fees, product eligibility, transfer status, and policy documents without forcing every request through a human advisor. It matters because the agent has to be accurate, compliant, and auditable; in this domain, a wrong answer is not just bad UX, it can create regulatory and client-risk exposure.
Architecture
- Client-facing chat API
  - Receives user messages from web, mobile, or advisor portals.
  - Keeps session state and conversation IDs for audit trails.
- Knowledge retrieval layer
  - Indexes approved documents: fee schedules, product brochures, account-opening rules, service policies, and FAQs.
  - Uses `VectorStoreIndex` plus a retriever to ground responses in source material.
- Policy and compliance guardrail
  - Blocks unsupported advice requests like “Should I buy this fund?”
  - Routes sensitive topics to human review when the agent detects suitability, tax advice, or complaints.
- Response synthesis layer
  - Uses a `QueryEngine` to generate concise answers with citations.
  - Keeps responses constrained to retrieved context.
- Audit logging
  - Stores prompt, retrieved nodes, model output, timestamps, and user identity.
  - Supports internal review and regulatory evidence.
- Data governance controls
  - Enforces document approval status and region-specific storage.
  - Prevents mixing public FAQ content with non-public client data unless explicitly allowed.
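The audit logging layer above can be sketched as one record per interaction. The type and field names here are illustrative assumptions for this article, not part of the LlamaIndex API:

```typescript
// Illustrative shape for one audited interaction; field names are
// assumptions of this sketch, not a LlamaIndex type.
interface AuditRecord {
  userId: string;             // authenticated client or advisor identity
  sessionId: string;          // conversation ID for the audit trail
  query: string;              // raw user message
  retrievedNodeIds: string[]; // which approved documents grounded the answer
  modelOutput: string;        // final synthesized response
  escalated: boolean;         // whether the guardrail routed to a human
  timestamp: string;          // ISO 8601, for immutable log ordering
}

// Create a record at the start of a request; retrieval and output
// fields are filled in as the pipeline runs.
function newAuditRecord(userId: string, sessionId: string, query: string): AuditRecord {
  return {
    userId,
    sessionId,
    query,
    retrievedNodeIds: [],
    modelOutput: "",
    escalated: false,
    timestamp: new Date().toISOString(),
  };
}
```

Writing the record before the model is called means even failed or escalated requests leave a trace.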
Implementation
1) Install the TypeScript packages
Use the LlamaIndex TypeScript SDK plus an OpenAI-compatible model provider. For production, pin versions and keep your embedding/model providers explicit.
```bash
npm install llamaindex dotenv
```
Set your environment variables:
```bash
OPENAI_API_KEY=your_key
```
2) Load approved wealth management content into an index
This example uses local files that contain approved support content. In practice, these should come from a controlled document pipeline with versioning and legal approval status.
```typescript
import "dotenv/config";
import { VectorStoreIndex, SimpleDirectoryReader } from "llamaindex";

async function buildIndex() {
  // Load every file from the approved knowledge-base directory.
  const docs = await new SimpleDirectoryReader().loadData({
    directoryPath: "./knowledge-base",
  });
  // Embed the documents and build a retrieval index over them.
  const index = await VectorStoreIndex.fromDocuments(docs);
  return index;
}
```
This is the core pattern: load only approved documents, convert them into nodes internally, then build a retrieval index. Do not dump raw CRM exports or unapproved advisor notes into this corpus.
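One way to enforce the approved-only rule in code is to filter on document metadata before indexing. A minimal sketch, assuming your ingestion pipeline stamps each document with an `approved` flag (a convention of this sketch, not a LlamaIndex feature):

```typescript
// Minimal document shape for this sketch; real LlamaIndex Documents
// carry a `metadata` record you can use the same way.
type ApprovableDoc = { text: string; metadata?: { approved?: boolean } };

// Keep only documents the compliance pipeline has explicitly marked
// approved; anything unlabeled is treated as unapproved.
function filterApproved<T extends ApprovableDoc>(docs: T[]): T[] {
  return docs.filter((doc) => doc.metadata?.approved === true);
}
```

Defaulting unlabeled documents to "rejected" is deliberate: a missing flag should fail closed, not slip into the corpus.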
3) Create a query engine with grounded behavior
Use `asQueryEngine()` so the agent answers from indexed material instead of freewheeling. Keep the prompt narrow and force escalation for regulated topics.
```typescript
import "dotenv/config";
import { VectorStoreIndex, SimpleDirectoryReader } from "llamaindex";

async function main() {
  const docs = await new SimpleDirectoryReader().loadData({
    directoryPath: "./knowledge-base",
  });
  const index = await VectorStoreIndex.fromDocuments(docs);

  const queryEngine = index.asQueryEngine({
    similarityTopK: 3,
    systemPrompt: `
You are a customer support agent for a wealth management firm.
Answer only using the provided context.
If the user asks for investment advice, tax advice, suitability guidance,
or anything outside policy/docs, say you need to escalate to a human advisor.
Cite the relevant source when possible.
`,
  });

  const response = await queryEngine.query({
    query: "What is the fee for managed portfolios?",
  });
  console.log(response.toString());
}

main().catch(console.error);
```
The important part here is not just retrieval. The `systemPrompt` constrains behavior so support stays in policy territory. For wealth management, that boundary matters more than raw answer quality.
4) Add an escalation rule before answering
You should not let every question hit retrieval. Run a lightweight policy check first and route risky requests away from automation.
```typescript
function needsEscalation(message: string): boolean {
  const text = message.toLowerCase();
  return [
    "should i buy",
    "should i sell",
    "best fund",
    "tax advice",
    "capital gains",
    "suitability",
    "guaranteed return",
    "complaint",
    "legal",
    "fiduciary",
  ].some((phrase) => text.includes(phrase));
}
```
Then wire it into your request flow:
```typescript
async function answerSupportQuestion(queryEngine: any, message: string) {
  if (needsEscalation(message)) {
    return {
      type: "handoff",
      message:
        "I can help with account service questions and policy information. This request needs review by a human advisor.",
    };
  }
  const response = await queryEngine.query({ query: message });
  return {
    type: "answer",
    message: response.toString(),
  };
}
```
That pattern keeps regulated advice out of the automated path. It also makes your escalation logic easy to audit because it is explicit code rather than hidden prompt behavior.
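Explicit code also makes it easy to record why a request was escalated. A small extension of the same idea, mapping phrases to policy categories so the handoff reason can go straight into the audit log (categories and phrase lists here are illustrative):

```typescript
// Map each risky phrase to a policy category so the handoff reason
// can be logged explicitly; categories are illustrative, not exhaustive.
const ESCALATION_RULES: Record<string, string[]> = {
  advice: ["should i buy", "should i sell", "best fund", "suitability"],
  tax: ["tax advice", "capital gains"],
  complaint: ["complaint", "legal", "fiduciary"],
};

// Returns the matched category, or null when the request is safe to
// answer from the knowledge base.
function escalationReason(message: string): string | null {
  const text = message.toLowerCase();
  for (const [category, phrases] of Object.entries(ESCALATION_RULES)) {
    if (phrases.some((p) => text.includes(p))) return category;
  }
  return null;
}
```

A categorized reason is more useful than a boolean during compliance review: it tells you which policy boundary the request touched.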
Production Considerations
- Audit everything
  - Store user ID, session ID, query text, retrieved document IDs, model output, and escalation reason.
  - Keep immutable logs for compliance review and dispute handling.
- Control data residency
  - Ensure embeddings and vector storage stay in approved regions.
  - If you operate across jurisdictions, separate indexes by region so client data does not cross borders unintentionally.
- Monitor answer quality
  - Track retrieval hit rate, escalation rate, hallucination reports, and “no answer” frequency.
  - Review sampled conversations weekly with compliance and support ops.
- Add hard guardrails
  - Block PII leakage in outputs.
  - Reject requests involving portfolio recommendations unless they are explicitly routed to licensed personnel.
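A hard guardrail like output PII masking can be a plain post-processing step applied to every response before it reaches the client. The regexes below are illustrative only; production redaction needs a vetted library and jurisdiction-specific rules:

```typescript
// Mask common PII patterns before the answer leaves the system.
// These patterns are illustrative, not an exhaustive redaction policy.
function redactPII(output: string): string {
  return output
    // email addresses
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[redacted email]")
    // long digit runs that look like account or card numbers
    .replace(/\b\d{8,}\b/g, "[redacted number]");
}
```

Running this on model output (not just user input) matters, because retrieved context can leak identifiers the user never typed.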
Common Pitfalls
- Using unapproved documents as knowledge sources
  If your index contains draft PDFs or advisor notes without compliance approval, the agent will confidently repeat bad information. Fix this by indexing only curated content with document-level approval metadata.
- Treating retrieval as enough for compliance
  A good vector search result does not make an answer compliant. Add explicit escalation rules for suitability, tax topics, complaints, fee disputes beyond policy text, and anything that looks like advice.
- Skipping audit metadata
  If you cannot reconstruct what the agent saw and answered at the time of interaction, you will struggle during reviews. Log prompts, retrieved chunks or node IDs from the `QueryEngine`, timestamps, model version, and handoff decisions.
The production version of this agent is boring by design. It answers policy-bound support questions quickly, escalates risky ones early, and leaves an audit trail that compliance can actually use.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.