How to Build a Fraud Detection Agent Using LlamaIndex in TypeScript for Pension Funds

By Cyprian Aarons · Updated 2026-04-21
fraud-detection · llamaindex · typescript · pension-funds

A fraud detection agent for pension funds watches member activity, contribution flows, benefit claims, and administrator actions for patterns that don’t fit normal behavior. It matters because pension fraud is usually low-frequency but high-impact: a single bad actor can trigger regulatory exposure, member harm, and expensive remediation across years of records.

Architecture

  • Document ingestion layer

    • Pulls in PDFs, emails, claim forms, transaction exports, KYC records, and administrator notes.
    • Normalizes them into text chunks with metadata like memberId, caseId, jurisdiction, and sourceSystem.
  • LlamaIndex retrieval layer

    • Uses VectorStoreIndex to retrieve prior cases, policy docs, and suspicious-pattern playbooks.
    • Keeps the agent grounded in internal pension rules instead of generic fraud heuristics.
  • Fraud analysis toolchain

    • Exposes tools for anomaly scoring, policy lookup, and case history retrieval.
    • Lets the agent compare a new event against known fraud signatures and compliance thresholds.
  • Decision orchestration

    • Uses a chat engine or agent workflow to classify risk, explain why, and recommend next steps.
    • Produces structured outputs for review queues and audit trails.
  • Audit and evidence store

    • Persists prompts, retrieved sources, final decisions, and timestamps.
    • Required for pension governance, regulator review, and internal model risk controls.
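The decision orchestration and audit layers above can be sketched as a single record shape. The field names here (caseId, retrievedSourceIds, and so on) are illustrative assumptions, not a LlamaIndex API:

```typescript
// Illustrative shape for the record the orchestration layer emits and the
// audit store persists. Field names are assumptions, not a fixed schema.
type DecisionRecord = {
  caseId: string;
  riskLevel: "low" | "medium" | "high";
  rationale: string;
  retrievedSourceIds: string[]; // document IDs returned by retrieval
  modelVersion: string;
  decidedAt: string; // ISO-8601 timestamp
};

// Build an audit-ready record from the agent's output plus retrieval metadata.
function buildDecisionRecord(
  caseId: string,
  riskLevel: DecisionRecord["riskLevel"],
  rationale: string,
  retrievedSourceIds: string[],
  modelVersion: string,
): DecisionRecord {
  return {
    caseId,
    riskLevel,
    rationale,
    retrievedSourceIds,
    modelVersion,
    decidedAt: new Date().toISOString(),
  };
}
```

Having one record type shared by the review queue and the audit store means every flagged case carries its evidence and model provenance by construction.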

Implementation

1) Install dependencies and load your data

Use LlamaIndex TS packages plus a real LLM provider. For a production setup, keep the vector store separate from your app process.

npm install llamaindex dotenv

import "dotenv/config";
import {
  Document,
  SimpleDirectoryReader,
  VectorStoreIndex,
} from "llamaindex";

async function loadDocuments() {
  const reader = new SimpleDirectoryReader();
  const docs = await reader.loadData("./pension-docs");

  return docs.map(
    (doc) =>
      new Document({
        text: doc.text,
        metadata: {
          source: doc.metadata?.fileName ?? "unknown",
          jurisdiction: "UK",
          domain: "pension-fraud",
        },
      }),
  );
}

This example loads local files like policy PDFs converted to text. In practice, you’d ingest claim forms, admin tickets, bank statements, and case notes through a controlled pipeline with PII redaction before indexing.
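As a minimal sketch of that redaction step, the pass below masks UK National Insurance numbers and email addresses before text reaches the index. The patterns are deliberately simplified assumptions; a production pipeline would use a dedicated PII detection service and also cover sort codes, account numbers, and addresses.

```typescript
// Minimal PII redaction pass applied to document text before indexing.
// Simplified patterns for illustration only; real pipelines need a proper
// PII detection service with broader coverage.
const NI_NUMBER = /\b[A-Z]{2}\d{6}[A-D]\b/g; // e.g. QQ123456C (simplified)
const EMAIL = /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g;

function redactPii(text: string): string {
  return text
    .replace(NI_NUMBER, "[NI-REDACTED]")
    .replace(EMAIL, "[EMAIL-REDACTED]");
}
```

Running documents through a function like this before Document construction keeps raw member identifiers out of the vector store and any hosted embedding service.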

2) Build an index over fraud cases and policy documents

The agent needs retrieval over both historical fraud cases and pension scheme rules. That keeps its answers explainable and aligned with fund policy.

import { OpenAI } from "llamaindex";

async function buildFraudIndex() {
  const docs = await loadDocuments();

  const index = await VectorStoreIndex.fromDocuments(docs);

  const queryEngine = index.asQueryEngine({
    similarityTopK: 4,
  });

  return queryEngine;
}

The important part here is VectorStoreIndex.fromDocuments(). That gives you semantic search across prior investigations so the agent can answer questions like: “Has this address ever appeared in duplicate benefit claims?” or “What does our scheme manual say about third-party bank account changes?”

3) Wrap retrieval as a tool and create the agent

For fraud detection you want the model to call tools instead of guessing. This is where LlamaIndex’s agent layer helps: the model can retrieve evidence before it labels an event as suspicious.

import {
  FunctionTool,
  ReActAgent,
} from "llamaindex";

async function main() {
  const queryEngine = await buildFraudIndex();
  const llm = new OpenAI({
    model: "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY,
  });

  const fraudLookupTool = FunctionTool.from(
    async ({ question }: { question: string }) => {
      const response = await queryEngine.query({ query: question });
      return response.toString();
    },
    {
      name: "fraud_lookup",
      description:
        "Search pension fraud cases, scheme rules, and suspicious-pattern guidance.",
    },
  );

  const agent = new ReActAgent({
    tools: [fraudLookupTool],
    llm,
    verbose: true,
    systemPrompt:
      "You are a pension fund fraud detection assistant. Use retrieved evidence only. Flag compliance risks clearly.",
  });

  const result = await agent.chat({
    message:
      "Review this case: member changed bank account twice in two days after requesting an early transfer. Is this suspicious?",
  });

  console.log(result.toString());
}

main().catch(console.error);

That pattern is production-friendly because it separates retrieval from reasoning. The tool returns evidence; the agent explains whether the case is suspicious based on that evidence.
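The same FunctionTool pattern can also wrap deterministic checks, so the agent cites a reproducible score rather than eyeballing raw events. The heuristic below is a hypothetical example, not a real fraud model: it scores bank detail change velocity, the exact pattern in the sample case above.

```typescript
// Hypothetical deterministic check the agent could call as a second tool:
// flags rapid bank detail changes within a short window. The thresholds
// here are illustrative, not tuned values.
type BankChangeEvent = { changedAt: Date };

function scoreBankChangeVelocity(
  events: BankChangeEvent[],
  windowDays: number = 7,
): "low" | "medium" | "high" {
  const windowMs = windowDays * 24 * 60 * 60 * 1000;
  const now = Date.now();
  const recent = events.filter((e) => now - e.changedAt.getTime() <= windowMs);
  if (recent.length >= 2) return "high"; // multiple changes in one window
  if (recent.length === 1) return "medium";
  return "low";
}
```

Exposing this alongside fraud_lookup gives the agent both evidence retrieval and a deterministic signal, which makes its final rationale easier to defend in review.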

4) Force structured output for case management

For operations teams, free-form prose is not enough. You want a JSON payload that can be pushed into a case system with severity, rationale, and cited evidence.

type FraudAssessment = {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
  evidenceSources: string[]; // IDs of the retrieved documents cited
};

async function assessCase(agentResponse: string): Promise<FraudAssessment> {
  // In production use schema validation after parsing model output.
  return JSON.parse(agentResponse) as FraudAssessment;
}

A better pattern is to instruct the agent to emit strict JSON and validate it with Zod before writing to your queue. That gives analysts consistent triage fields across all cases.
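As a dependency-free sketch of that validation step (Zod is the more robust option; the local Assessment type and its fields here are illustrative), a hand-rolled type guard can reject malformed model output before it reaches the queue:

```typescript
// Validate model output before writing to the case queue. A schema library
// like Zod is preferable in production; this guard just shows the idea.
type Assessment = {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
};

function parseAssessment(raw: string): Assessment {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null) {
    throw new Error("Model output is not a JSON object");
  }
  const candidate = value as Record<string, unknown>;
  const validLevels = ["low", "medium", "high"];
  if (
    typeof candidate.riskLevel !== "string" ||
    !validLevels.includes(candidate.riskLevel)
  ) {
    throw new Error("Invalid riskLevel");
  }
  if (
    !Array.isArray(candidate.reasons) ||
    !candidate.reasons.every((r) => typeof r === "string")
  ) {
    throw new Error("Invalid reasons");
  }
  return {
    riskLevel: candidate.riskLevel as Assessment["riskLevel"],
    reasons: candidate.reasons as string[],
  };
}
```

Rejecting bad payloads at this boundary means a hallucinated severity level becomes a retry or an error, never a malformed case in the triage queue.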

Production Considerations

  • Compliance logging

    • Store every prompt, retrieved document ID, response, model version, and timestamp.
    • Pension funds need auditable decisions for trustees, regulators, and internal control reviews.
  • Data residency

    • Keep member data in-region if your fund has UK/EU residency requirements.
    • If you use hosted embeddings or external LLMs, confirm where content is processed and retained.
  • Guardrails on actioning

    • The agent should flag risk; it should not auto-freeze accounts or deny benefits.
    • Route only high-confidence cases to human review with source citations attached.
  • Monitoring

    • Track false positives by scenario type: bank detail changes, transfer requests, death claims, address changes.
    • Monitor retrieval quality too; bad chunks or missing policy docs will produce confident but useless answers.
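A minimal sketch of that per-scenario tracking, with a hypothetical Verdict record built from analyst outcomes:

```typescript
// Track analyst verdicts per scenario so false-positive rates can be
// compared across fraud patterns. The record shape is illustrative.
type Verdict = {
  scenario: string; // e.g. "bank-detail-change", "transfer-request"
  flagged: boolean; // did the agent flag this case?
  confirmedFraud: boolean; // analyst's final determination
};

function falsePositiveRate(verdicts: Verdict[], scenario: string): number {
  const flagged = verdicts.filter((v) => v.scenario === scenario && v.flagged);
  if (flagged.length === 0) return 0;
  const falsePositives = flagged.filter((v) => !v.confirmedFraud).length;
  return falsePositives / flagged.length;
}
```

Reviewing these rates per scenario shows where the agent over-flags (wasting analyst time) versus where retrieval or prompting needs work.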

Common Pitfalls

  • Using generic fraud prompts without pension context

    Generic anti-fraud logic misses scheme-specific abuse like impersonation during transfer requests or forged beneficiary updates. Fix this by indexing scheme rules, admin procedures, trustee policies, and historical case outcomes together.

  • Skipping evidence traceability

    If the agent cannot show which documents influenced its answer, auditors will reject it. Always return source metadata from retrieval and persist it with the decision record.

  • Letting the model make final decisions

    A pension fraud agent should support investigators, not replace them. Keep humans in the loop for any action that affects benefits payment status or member access.


By Cyprian Aarons, AI Consultant at Topiax.