How to Build a Fraud Detection Agent for Healthcare Using LlamaIndex in TypeScript
A fraud detection agent for healthcare flags suspicious claims, billing patterns, and provider activity before they turn into financial loss or compliance exposure. It matters because healthcare fraud is not just a cost problem; it creates audit risk, delays legitimate reimbursement, and can trigger regulatory action if you can’t explain why a claim was flagged.
Architecture
- Claim ingestion layer
  - Pulls structured claim data from your EHR, claims platform, or data warehouse.
  - Normalizes fields like CPT/HCPCS codes, diagnosis codes, provider IDs, dates of service, and billed amounts.
- Policy and rules context
  - Stores payer policies, prior authorization rules, coding guidelines, and internal fraud heuristics.
  - Gives the agent grounding so it does not rely only on statistical similarity.
- Vector index over historical cases
  - Indexes past fraud investigations, denied claims, audit notes, and remediation outcomes.
  - Lets the agent retrieve similar cases before making a recommendation.
- LLM reasoning layer
  - Uses LlamaIndex to combine retrieved evidence with the current claim payload.
  - Produces a structured fraud risk assessment with reasons and supporting citations.
- Audit trail store
  - Persists every input, retrieval result, prompt version, and final decision.
  - Required for compliance review and post-incident analysis.
- Human review queue
  - Routes high-risk or ambiguous cases to a billing specialist or SIU analyst.
  - Prevents automatic denial without oversight.
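The audit trail store above can start as an append-only log of typed records. Here is a minimal sketch; the `AuditRecord` shape and `buildAuditRecord` helper are illustrative assumptions of this article, not LlamaIndex APIs, so adapt the fields to whatever your compliance team needs to reconstruct a decision.

```typescript
// Illustrative audit record for one agent decision; field names are assumptions.
export interface AuditRecord {
  claimId: string;
  promptVersion: string; // which prompt template produced this decision
  modelVersion: string; // which model produced this decision
  retrievedCaseIds: string[]; // IDs of historical cases the agent saw
  decision: "auto_approve" | "manual_review" | "escalate_siu";
  reasons: string[];
  createdAt: string; // ISO timestamp
}

export function buildAuditRecord(
  claimId: string,
  promptVersion: string,
  modelVersion: string,
  retrievedCaseIds: string[],
  decision: AuditRecord["decision"],
  reasons: string[],
): AuditRecord {
  return {
    claimId,
    promptVersion,
    modelVersion,
    retrievedCaseIds,
    decision,
    reasons,
    createdAt: new Date().toISOString(),
  };
}
```

Writing one of these per decision, before the decision is acted on, is what lets you answer "why was this claim flagged" months later.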
Implementation
1) Install dependencies and define your data model
Use LlamaIndex’s TypeScript package plus a runtime schema validator (here, zod) if you want strict outputs. In healthcare, keep PHI out of logs and only pass the minimum necessary fields into the agent.
```bash
npm install llamaindex zod
```

```ts
import { Document } from "llamaindex";

export interface ClaimRecord {
  claimId: string;
  memberId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  placeOfService: string;
  billedAmount: number;
  dateOfService: string;
  submittedAt: string;
}

export function claimToDocument(claim: ClaimRecord): Document {
  return new Document({
    id_: claim.claimId,
    text: JSON.stringify(claim),
    metadata: {
      claimId: claim.claimId,
      providerId: claim.providerId,
      billedAmount: claim.billedAmount,
      dateOfService: claim.dateOfService,
      submittedAt: claim.submittedAt,
    },
  });
}
```
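To enforce the minimum-necessary principle in code rather than by convention, you can strip direct member identifiers before a claim ever reaches an index or a prompt. The `redactClaim` helper below is this article's own sketch, not a LlamaIndex API; fraud patterns here key off provider and coding behavior, so dropping `memberId` rarely costs signal.

```typescript
// Local copy of the ClaimRecord interface defined earlier in the article.
interface ClaimRecord {
  claimId: string;
  memberId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  placeOfService: string;
  billedAmount: number;
  dateOfService: string;
  submittedAt: string;
}

// A claim with the direct member identifier removed; safer to embed and log.
type RedactedClaim = Omit<ClaimRecord, "memberId">;

// Drop memberId entirely. Extend with a keyed hash instead if your
// investigators need to link claims back to a member later.
export function redactClaim(claim: ClaimRecord): RedactedClaim {
  const { memberId, ...rest } = claim;
  return rest;
}
```

Call this at the ingestion boundary so nothing downstream, including `claimToDocument`, ever sees the raw identifier.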
2) Build an index over historical fraud cases
This is where LlamaIndex earns its keep. You are not asking the model to invent fraud patterns from scratch; you are retrieving similar investigations and using them as evidence.
```ts
import { VectorStoreIndex, storageContextFromDefaults } from "llamaindex";
import { claimToDocument, ClaimRecord } from "./claims";

async function buildFraudIndex(history: ClaimRecord[]) {
  const docs = history.map(claimToDocument);
  // Create the storage context first so the index is persisted to disk
  // as it is built, instead of living only in memory.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage/fraud-index",
  });
  const index = await VectorStoreIndex.fromDocuments(docs, { storageContext });
  return index;
}
```
If your environment needs data residency controls, point persistence to a region-locked object store or self-hosted vector DB instead of local disk.
3) Query the index with a current claim and generate a risk assessment
Use asQueryEngine() for retrieval plus synthesis. The pattern below returns a structured answer that your downstream workflow can score and route.
```ts
import { VectorStoreIndex, storageContextFromDefaults } from "llamaindex";
import { z } from "zod";

const FraudAssessmentSchema = z.object({
  riskLevel: z.enum(["low", "medium", "high"]),
  reasons: z.array(z.string()),
  recommendedAction: z.enum(["auto_approve", "manual_review", "escalate_siu"]),
});

async function assessClaim(claimJson: string) {
  // Reload the persisted index rather than rebuilding it on every request.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage/fraud-index",
  });
  const index = await VectorStoreIndex.init({ storageContext });

  const retriever = index.asRetriever({ similarityTopK: 5 });
  const queryEngine = index.asQueryEngine({ retriever });

  const prompt = `
You are a healthcare fraud detection assistant.
Assess this claim for fraud indicators using retrieved historical cases and billing logic.
Return only valid JSON with keys riskLevel, reasons, recommendedAction.

Claim:
${claimJson}

Focus on:
- duplicate billing
- unbundling
- impossible dates of service
- unusual provider utilization
- mismatch between diagnosis and procedure codes
`;

  const response = await queryEngine.query({ query: prompt });
  // Throws on malformed output; catch upstream and route to manual review.
  return FraudAssessmentSchema.parse(JSON.parse(response.toString()));
}
```
4) Put it behind a review workflow
Do not let the model directly deny claims. Use it as a triage signal that feeds rules-based thresholds and human review. That keeps you safer on compliance and easier to defend in audits.
```ts
async function routeClaim(claimJson: string) {
  const assessment = await assessClaim(claimJson);

  if (assessment.riskLevel === "high") {
    // High risk goes straight to the Special Investigations Unit queue.
    return {
      action: "manual_review",
      reasonCodes: assessment.reasons,
      queue: "siu",
    };
  }

  if (assessment.riskLevel === "medium") {
    return {
      action: "manual_review",
      reasonCodes: assessment.reasons,
      queue: "billing_audit",
    };
  }

  return {
    action: "auto_approve",
    reasonCodes: assessment.reasons,
  };
}
```
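Because the model can return malformed JSON, the routing layer should fail closed: anything unparsable goes to manual review, never auto-approval. The dependency-free sketch below is this article's assumption (zod's `safeParse` achieves the same thing); the function name and fallback policy are illustrative.

```typescript
type RiskLevel = "low" | "medium" | "high";
type Action = "auto_approve" | "manual_review" | "escalate_siu";

interface FraudAssessment {
  riskLevel: RiskLevel;
  reasons: string[];
  recommendedAction: Action;
}

// Parse model output defensively. On any malformed payload, fail closed to
// manual review instead of letting a bad response auto-approve a claim.
export function parseAssessmentOrFailClosed(raw: string): FraudAssessment {
  const fallback: FraudAssessment = {
    riskLevel: "medium",
    reasons: ["model_output_unparsable"],
    recommendedAction: "manual_review",
  };
  try {
    const obj = JSON.parse(raw);
    const levels: RiskLevel[] = ["low", "medium", "high"];
    const actions: Action[] = ["auto_approve", "manual_review", "escalate_siu"];
    if (
      levels.includes(obj.riskLevel) &&
      actions.includes(obj.recommendedAction) &&
      Array.isArray(obj.reasons) &&
      obj.reasons.every((r: unknown) => typeof r === "string")
    ) {
      return {
        riskLevel: obj.riskLevel,
        reasons: obj.reasons,
        recommendedAction: obj.recommendedAction,
      };
    }
    return fallback;
  } catch {
    return fallback;
  }
}
```

The key property: no parsing failure, enum typo, or truncated response can ever produce `auto_approve`.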
Production Considerations
- Compliance first
  - Treat all claim payloads as sensitive health data.
  - Minimize PHI in prompts, encrypt data at rest and in transit, and maintain access controls aligned with HIPAA or local equivalents.
- Auditability
  - Persist the exact retrieved chunks, prompt template version, model version, and final output.
  - If an auditor asks why a claim was flagged, you need traceable evidence rather than “the model said so.”
- Data residency
  - Keep embeddings, vector stores, and logs in the same jurisdiction as your regulated data.
  - If you serve multiple regions, isolate indexes per region instead of mixing patient data across borders.
- Monitoring
  - Track false positives by provider specialty, payer type, and code family.
  - Watch for drift when billing policies change or new abuse patterns appear.
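The monitoring point can start as a simple in-memory counter before you reach for a metrics stack. A sketch, with illustrative names: track flagged versus confirmed-fraud counts per provider specialty and compute a false-positive rate from the investigation outcomes that come back from the review queue.

```typescript
// Per-specialty counters: how many claims we flagged, and how many of those
// investigations actually confirmed fraud.
interface SpecialtyStats {
  flagged: number;
  confirmed: number;
}

export class FalsePositiveTracker {
  private stats = new Map<string, SpecialtyStats>();

  // Call when an investigation closes, with its confirmed outcome.
  recordFlag(specialty: string, confirmedFraud: boolean): void {
    const s = this.stats.get(specialty) ?? { flagged: 0, confirmed: 0 };
    s.flagged += 1;
    if (confirmedFraud) s.confirmed += 1;
    this.stats.set(specialty, s);
  }

  // Fraction of flags that did NOT turn out to be fraud; null if no data.
  falsePositiveRate(specialty: string): number | null {
    const s = this.stats.get(specialty);
    if (!s || s.flagged === 0) return null;
    return (s.flagged - s.confirmed) / s.flagged;
  }
}
```

A rising rate in one specialty is often the first sign that a policy change has made the historical cases in your index stale.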
Common Pitfalls
- Using free-form LLM output in production
  - If you parse plain-text decisions manually, you will eventually break routing logic.
  - Force structured JSON output with a schema validator like zod and reject malformed responses.
- Indexing raw PHI without governance
  - Dumping notes or full chart text into the vector store creates compliance debt fast.
  - Store only what is needed for fraud analysis and redact identifiers where possible.
- Letting the agent make final adjudication decisions
  - Fraud detection should triage; it should not be the final authority on denial or escalation.
  - Keep deterministic business rules and human review in the loop for anything high impact.
- Ignoring regional policy differences
  - Healthcare billing rules vary by payer, state, country, and specialty.
  - Partition retrieval context by jurisdiction so the agent does not apply one region’s policy to another’s claims.
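One way to enforce the jurisdiction split is at the storage layer: one persisted index per region, with a hard failure on unknown regions so a claim can never fall through to a default, mixed index. The helper below is a sketch; the directory layout and region codes are this article's assumptions.

```typescript
// Regions you actually operate in; extend deliberately, never default.
const KNOWN_REGIONS = ["us", "eu", "uk"] as const;
type Region = (typeof KNOWN_REGIONS)[number];

// Map a claim's jurisdiction to its own persisted-index directory.
// Throwing on unknown regions prevents silently mixing data across borders.
export function persistDirForRegion(region: string): string {
  if (!(KNOWN_REGIONS as readonly string[]).includes(region)) {
    throw new Error(`No fraud index configured for region: ${region}`);
  }
  return `./storage/fraud-index/${region as Region}`;
}
```

Feed the returned path into `storageContextFromDefaults({ persistDir })` when building or loading the index, so retrieval for a claim can only ever see cases from its own jurisdiction.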
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.