How to Build a Customer Support Agent Using LlamaIndex in TypeScript for Lending
A customer support agent for lending answers borrower questions, pulls policy-backed responses from your knowledge base, and routes risky or account-specific requests to the right workflow. That matters because lending support is not just about speed; it has to stay aligned with compliance, avoid hallucinating payment terms, and keep a clean audit trail for every answer.
Architecture
- **Channel adapter**
  - Receives chat messages from web, mobile, or CRM.
  - Normalizes user input into a single request shape.
- **Lending knowledge index**
  - Stores FAQs, product guides, underwriting policies, repayment rules, and hardship procedures.
  - Built with `VectorStoreIndex` so the agent can retrieve grounded answers.
- **Retriever-backed response layer**
  - Uses LlamaIndex retrieval to pull relevant policy chunks before answering.
  - Keeps the model anchored to approved content.
- **Tooling layer**
  - Adds account lookup, loan status checks, repayment schedule fetches, and case creation.
  - Keeps sensitive actions out of free-form generation.
- **Guardrails and policy filter**
  - Blocks unsupported advice like legal interpretation or credit decisions.
  - Forces escalation when the question crosses compliance boundaries.
- **Audit and observability**
  - Logs prompt inputs, retrieved sources, tool calls, and final responses.
  - Needed for dispute handling and regulatory review.
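The adapter's "single request shape" is just a normalized envelope around the raw channel message. A minimal sketch of what that could look like (the field names here are illustrative assumptions, not part of any LlamaIndex API):

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical normalized request produced by the channel adapter.
type SupportRequest = {
  sessionId: string;
  channel: "web" | "mobile" | "crm";
  userMessage: string;
  receivedAt: string; // ISO timestamp
};

function normalize(channel: SupportRequest["channel"], raw: string): SupportRequest {
  return {
    sessionId: randomUUID(),
    channel,
    userMessage: raw.trim(),
    receivedAt: new Date().toISOString(),
  };
}

const req = normalize("web", "  Can I defer my next payment?  ");
console.log(req.userMessage); // "Can I defer my next payment?"
```

Downstream layers (classification, retrieval, tools, audit) then only ever see this one shape, regardless of which channel the message arrived on.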
Implementation
1) Install and set up the project

Use the TypeScript packages from LlamaIndex and keep your environment variables in place for model access.

```bash
npm install llamaindex dotenv
```

Create a `.env` file:

```
OPENAI_API_KEY=your_key_here
```

Then initialize a basic support agent with retrieval over lending documents.
```typescript
// Note: top-level await requires running this as an ES module.
import "dotenv/config";
import {
  Document,
  VectorStoreIndex,
  Settings,
  OpenAI,
} from "llamaindex";

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
});

const docs = [
  new Document({
    text: `
Payment deferrals are available for eligible borrowers experiencing temporary hardship.
Borrowers must submit a request before the due date whenever possible.
Approval is subject to review by servicing policy.
`,
    metadata: { source: "hardship_policy_v1" },
  }),
  new Document({
    text: `
Late fees apply after a grace period of 10 calendar days.
Partial payments may not prevent delinquency status.
Customers should be advised to review their loan agreement for exact terms.
`,
    metadata: { source: "repayment_faq_v2" },
  }),
];

const index = await VectorStoreIndex.fromDocuments(docs);
const queryEngine = index.asQueryEngine();

const response = await queryEngine.query({
  query: "Can I defer my next payment if I lost my job?",
});

console.log(response.toString());
```
2) Add a support policy gate before answering
For lending, not every question should be answered directly. Questions about credit decisions, legal interpretation, adverse action reasons, or personalized eligibility should trigger escalation.
```typescript
function classifySupportRequest(message: string) {
  const text = message.toLowerCase();

  const escalationTriggers = [
    "credit score",
    "why was i denied",
    "adverse action",
    "legal",
    "lawsuit",
    "regulation",
    "appeal my denial",
    "change my rate",
  ];

  const needsEscalation = escalationTriggers.some((term) => text.includes(term));

  return {
    needsEscalation,
    reason: needsEscalation ? "policy_or_compliance_sensitive" : "general_support",
  };
}
```
Use this gate before calling retrieval. That keeps the agent from improvising on regulated topics.
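Plain substring matching can over-trigger: "legal" also fires on "legally binding" in a question about e-signatures. A word-boundary variant is a small, cheap improvement (a sketch, not a replacement for a proper intent classifier):

```typescript
// Word-boundary regexes reduce false positives from substring hits.
const escalationPatterns: RegExp[] = [
  /\bwhy was i denied\b/,
  /\badverse action\b/,
  /\blegal\b/,
  /\bcredit score\b/,
];

function needsEscalation(message: string): boolean {
  const text = message.toLowerCase();
  return escalationPatterns.some((re) => re.test(text));
}

console.log(needsEscalation("Is this legally binding?")); // false: "legally" is not a bare "legal" token
console.log(needsEscalation("Why was I denied a loan?")); // true
```

For production, most teams eventually move this gate to an LLM or a trained classifier, but a deterministic regex layer remains useful as a hard backstop that cannot be prompt-injected.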
3) Wrap retrieval in an answer function with citations
The important pattern is: classify first, retrieve second, answer third. Keep the response tied to source documents so agents can cite where the answer came from.
```typescript
import { QueryEngine } from "llamaindex";

async function answerBorrowerQuestion(
  queryEngine: QueryEngine,
  message: string
) {
  const classification = classifySupportRequest(message);

  if (classification.needsEscalation) {
    return {
      type: "escalation",
      message:
        "I can connect you with a specialist for that request. It involves account-specific or regulated lending guidance.",
    };
  }

  const result = await queryEngine.query({
    query: message,
  });

  return {
    type: "answer",
    message: result.toString(),
    sources:
      // `sourceNodes` is commonly available on query results depending on retriever/engine config
      (result as any).sourceNodes?.map((node: any) => ({
        text: node.node.text,
        source: node.node.metadata?.source,
        score: node.score,
      })) ?? [],
  };
}

const finalResponse = await answerBorrowerQuestion(
  queryEngine,
  "What happens if I miss my payment?"
);

console.log(JSON.stringify(finalResponse, null, 2));
```
If you want better control over retrieval quality, tune chunking and similarity settings on your index setup. In lending support, short policy chunks usually work better than long mixed documents because they reduce irrelevant context.
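To see why chunk size matters, here is a minimal standalone word chunker with overlap. This is purely illustrative: LlamaIndex ships its own splitters, and this sketch only shows the shape of the operation so you can reason about chunk size and overlap as tuning knobs:

```typescript
// Illustrative fixed-size word chunker with overlap (not LlamaIndex's splitter).
// `size` must be larger than `overlap`, or the loop would not advance.
function chunkWords(text: string, size: number, overlap: number): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size - overlap) {
    chunks.push(words.slice(i, i + size).join(" "));
    if (i + size >= words.length) break;
  }
  return chunks;
}

const policy =
  "Payment deferrals are available for eligible borrowers experiencing temporary " +
  "hardship and must be requested before the due date";

const chunks = chunkWords(policy, 8, 2);
console.log(chunks.length); // 3
```

Smaller chunks mean each retrieved passage carries one policy rule rather than a grab bag, which is what keeps irrelevant context out of the model's answer.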
4) Add operational logging for auditability
For lending workflows, log the user message, classification result, retrieved sources, and final output. Do not log raw PII unless your retention policy allows it.
```typescript
// Extra optional fields here are one reasonable shape; adapt to your schema.
type AuditEvent = {
  timestamp: string;
  sessionId: string;
  userMessage: string;
  classification?: string;
  retrievedSources?: string[];
  finalResponse?: string;
};

function writeAudit(event: AuditEvent & Record<string, unknown>) {
  // Ship to your durable log pipeline (structured logging, SIEM, etc.).
  // Redact PII before this point; audit logs outlive chat sessions.
  console.log(JSON.stringify(event));
}
```
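Redaction at ingress can start with simple pattern matching. The sketch below is a hypothetical regex scrubber that only catches obvious formats; a real deployment should use a vetted redaction service and tune patterns to its own data:

```typescript
// Hypothetical regex-based PII scrubber (illustrative; catches obvious formats only).
function redactPII(text: string): string {
  return text
    .replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]")   // US SSN format
    .replace(/\b\d{9,17}\b/g, "[ACCOUNT]");       // long digit runs (account numbers)
}

const scrubbed = redactPII("My SSN is 123-45-6789 and account 000123456789.");
console.log(scrubbed); // "My SSN is [SSN] and account [ACCOUNT]."
```

Run user messages through this before they reach `writeAudit`, so raw identifiers never land in long-lived logs.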
Production Considerations
- **Data residency**
  Keep borrower-facing content and embeddings in the region required by your banking or lending policy. If your jurisdiction requires local processing or storage, make sure both document ingestion and vector storage comply.
- **Compliance controls**
  Add hard rules for topics like adverse action notices, APR changes, debt collection language, hardship eligibility, and credit reporting disputes. The agent should escalate instead of generating answers when policy says a human must respond.
- **Monitoring**
  Monitor retrieval hit rate, escalation rate, hallucination reports, and unanswered intents. In lending support, a spike in “payment deferral” or “denial” questions often means policy content changed and your index is stale.
- **Audit trails**
  Record which documents were retrieved for each response. That gives you traceability when a borrower disputes advice or an internal reviewer asks why the assistant said something specific.
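Tracking rates like escalation rate does not require heavy infrastructure to start. A minimal in-process counter, as a sketch (production systems would export these to a metrics backend such as Prometheus or CloudWatch):

```typescript
// Simple in-memory counters for the support-agent metrics named above.
const metrics = new Map<string, number>();

function inc(name: string): void {
  metrics.set(name, (metrics.get(name) ?? 0) + 1);
}

// Call inc() at the relevant points in the request flow:
inc("requests_total");    // every inbound message
inc("escalations_total"); // every escalation decision
inc("requests_total");

const escalationRate =
  (metrics.get("escalations_total") ?? 0) / (metrics.get("requests_total") ?? 1);
console.log(escalationRate); // 0.5
```

A sudden jump in `escalationRate` is often the earliest signal that either the classifier triggers are too broad or the knowledge index has gone stale.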
Common Pitfalls
- **Letting the model answer regulated questions directly**
  - Mistake: asking the LLM to explain denial reasons or legal rights without guardrails.
  - Fix: classify those requests first and route them to humans or approved templates.
- **Mixing product docs with account data in one retrieval pool**
  - Mistake: indexing internal servicing notes alongside public FAQs.
  - Fix: separate public knowledge from private account context and only inject private data through controlled tools.
- **Ignoring stale policy content**
  - Mistake: shipping an index built from last quarter’s repayment rules.
  - Fix: version your documents by effective date and rebuild indexes whenever lending policies change.
- **Logging too much sensitive data**
  - Mistake: storing full SSNs, bank details, or complaint narratives in plain logs.
  - Fix: redact PII at ingress and keep audit logs minimal but sufficient for compliance review.
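Versioning by effective date can be as simple as metadata on each document plus a filter at index-build time. A sketch (the `PolicyDoc` shape and field names are assumptions for illustration; ISO date strings compare correctly as plain strings):

```typescript
// Illustrative: tag each policy with an effective date, keep only the
// latest version per source that is in effect as of a given date.
type PolicyDoc = { text: string; source: string; effectiveDate: string };

function currentPolicies(docs: PolicyDoc[], asOf: string): PolicyDoc[] {
  const latest = new Map<string, PolicyDoc>();
  for (const d of docs) {
    if (d.effectiveDate > asOf) continue; // not yet in effect
    const prev = latest.get(d.source);
    if (!prev || d.effectiveDate > prev.effectiveDate) latest.set(d.source, d);
  }
  return [...latest.values()];
}

const policyDocs: PolicyDoc[] = [
  { text: "old grace period", source: "repayment_faq", effectiveDate: "2024-01-01" },
  { text: "new grace period", source: "repayment_faq", effectiveDate: "2024-07-01" },
  { text: "hardship policy", source: "hardship_policy", effectiveDate: "2024-03-01" },
];

const active = currentPolicies(policyDocs, "2024-08-01");
console.log(active.length); // 2: one current doc per source
```

Rebuild the index from `currentPolicies(...)` output on every policy change, and the agent can never retrieve last quarter's rules.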
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.