How to Build a Claims Processing Agent Using LlamaIndex in TypeScript for Healthcare
A claims processing agent in healthcare takes in a claim, checks it against policy rules, member eligibility, prior authorization requirements, and clinical documentation, then produces a decision package for a human reviewer or downstream system. It matters because the expensive part of claims operations is not just adjudication — it’s handling exceptions, reducing manual review time, and keeping every decision explainable for compliance and audit.
Architecture
Build this agent as a pipeline, not a single prompt.
- **Claim intake layer**
  - Accepts structured claim payloads: member ID, CPT/HCPCS codes, ICD-10 codes, provider NPI, dates of service, attachments.
  - Normalizes input into a consistent schema before retrieval or reasoning.
- **Policy and benefits retrieval**
  - Uses `VectorStoreIndex` over plan documents, payer policy PDFs, medical necessity guidelines, and prior auth rules.
  - Retrieves only the relevant policy snippets for the claim being reviewed.
- **Rules engine**
  - Handles deterministic checks first: eligibility dates, code combinations, missing fields, authorization presence.
  - Keeps hard compliance logic out of the LLM path.
- **LLM decision layer**
  - Uses LlamaIndex query engines to summarize evidence and produce a recommendation.
  - Generates an auditable rationale tied to retrieved sources.
- **Audit and trace store**
  - Persists input claim hash, retrieved nodes, model output, timestamps, reviewer overrides.
  - Required for HIPAA-aligned traceability and internal QA.
- **Human review handoff**
  - Routes low-confidence or high-risk claims to an adjuster or nurse reviewer.
  - Prevents autonomous denial on ambiguous cases.
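One way to make the pipeline shape concrete is to treat each layer as a function over a shared context object. The sketch below is illustrative only: the `Stage` type, field names, and `runPipeline` helper are hypothetical, and `ClaimInput` is the validated claim type defined in step 1 of the implementation.

```ts
import type { ClaimInput } from "./schema";

// Hypothetical shared context passed between pipeline stages.
interface ClaimContext {
  claim: ClaimInput;        // validated intake payload
  ruleFlags: string[];      // deterministic findings
  evidence: string[];       // retrieved policy snippets
  recommendation?: "approve" | "pend" | "deny";
  needsHumanReview: boolean;
}

type Stage = (ctx: ClaimContext) => Promise<ClaimContext>;

// Stages run in order; any stage can short-circuit to human review.
async function runPipeline(ctx: ClaimContext, stages: Stage[]) {
  for (const stage of stages) {
    ctx = await stage(ctx);
    if (ctx.needsHumanReview) break; // hand off instead of guessing
  }
  return ctx;
}
```

Short-circuiting on `needsHumanReview` is what keeps ambiguous claims out of the autonomous path.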
Implementation
1) Install dependencies and define your claim schema
Use TypeScript with the core LlamaIndex package, plus the OpenAI provider package used in step 3. Keep the claim payload explicit so you can validate it before any model call.

```bash
npm install llamaindex @llamaindex/openai zod
```
```ts
import { z } from "zod";

export const ClaimSchema = z.object({
  claimId: z.string(),
  memberId: z.string(),
  providerNpi: z.string(),
  icd10Codes: z.array(z.string()).min(1),
  cptCodes: z.array(z.string()).min(1),
  dateOfService: z.string(),
  placeOfService: z.string(),
  priorAuthId: z.string().optional(),
});

export type ClaimInput = z.infer<typeof ClaimSchema>;
```
This keeps PHI-bearing inputs controlled at the boundary. In healthcare workflows, bad schema discipline becomes audit debt fast.
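As a quick illustration, a boundary validator can use zod's `safeParse` so the rejection path is explicit. The `validateClaim` helper name is hypothetical:

```ts
import { ClaimSchema, type ClaimInput } from "./schema";

// Reject malformed payloads at the boundary so nothing unvalidated
// reaches retrieval or the model.
export function validateClaim(rawPayload: unknown): ClaimInput {
  const parsed = ClaimSchema.safeParse(rawPayload);
  if (!parsed.success) {
    const issues = parsed.error.issues.map((i) => i.message).join("; ");
    throw new Error(`Claim rejected at intake: ${issues}`);
  }
  return parsed.data;
}
```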
2) Load policy documents into a VectorStoreIndex
For real deployments, index only approved policy content. Do not mix raw claims data with policy embeddings unless you have a clear data residency and retention plan.
```ts
import { Document, VectorStoreIndex } from "llamaindex";

const docs = [
  new Document({
    text: `
MRI lumbar spine requires prior authorization when performed outpatient
unless emergent red flags are documented. Review medical necessity criteria.
`,
    metadata: { source: "policy_mri_lumbar_v3.pdf", jurisdiction: "US" },
  }),
  new Document({
    text: `
Claims with CPT 99213 on same date as procedure may require modifier review.
Check bundling edits and provider specialty.
`,
    metadata: { source: "claims_edits_2025.qrg", jurisdiction: "US" },
  }),
];

const index = await VectorStoreIndex.fromDocuments(docs);
const retriever = index.asRetriever({ similarityTopK: 3 });
```
This pattern gives you retrieval grounded in approved payer content. The agent should answer from these nodes first, not from free-form memory.
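Before wiring the retriever into a query engine, it helps to inspect what it actually returns. A minimal spot-check, using the `retrieve({ query })` call shape from the LlamaIndex.TS docs (verify against your installed version):

```ts
// Spot-check retrieval quality for a lumbar MRI scenario.
const hits = await retriever.retrieve({
  query: "outpatient lumbar MRI prior authorization requirements",
});

for (const hit of hits) {
  // Metadata here is what later feeds reviewer citations and audit records.
  console.log(hit.score, hit.node.metadata?.source);
}
```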
3) Build the claim review function with retrieval plus deterministic checks
Do your deterministic validation before calling the LLM. Then pass both the claim and retrieved evidence into a query engine built from the same index.
```ts
import { QueryEngineTool, Settings } from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
import { ClaimSchema } from "./schema";

const llm = new OpenAI({ model: "gpt-4o-mini" });
// Query engines pick the model up from global Settings.
Settings.llm = llm;

async function reviewClaim(rawClaim: unknown) {
  const claim = ClaimSchema.parse(rawClaim);

  // Deterministic checks run before any model call.
  const ruleFlags: string[] = [];
  if (!claim.priorAuthId && claim.cptCodes.includes("72148")) {
    ruleFlags.push("Possible missing prior authorization for lumbar MRI");
  }

  // Reuses the index and retriever built in step 2; top-k is already
  // configured on the retriever itself.
  const queryEngine = index.asQueryEngine({ retriever });

  const prompt = `
You are reviewing a healthcare insurance claim.
Return:
1) recommendation (approve / pend / deny)
2) rationale
3) cited evidence
4) confidence

Claim:
${JSON.stringify(claim)}

Deterministic flags:
${ruleFlags.join("\n") || "none"}
`;

  const response = await queryEngine.query({ query: prompt });

  // Expose the engine as a named tool for a future agent loop.
  const tool = new QueryEngineTool({
    queryEngine,
    metadata: {
      name: "policy_lookup",
      description: "Searches approved claims policy documents",
    },
  });

  return {
    claimId: claim.claimId,
    flags: ruleFlags,
    resultText: response.toString(),
    // Source metadata from the retrieved nodes, for the audit record.
    sources: (response.sourceNodes ?? []).map((n) =>
      String(n.node.metadata?.source ?? "unknown"),
    ),
    toolName: tool.metadata.name,
  };
}
```
The important part is that the LLM is not making raw decisions in isolation. It’s reviewing a constrained context built from validated input plus retrieved policy evidence.
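Growing the deterministic layer is plain TypeScript, not prompting. A sketch of two more checks; the code list and the one-year filing window are illustrative placeholders, not real payer rules:

```ts
import type { ClaimInput } from "./schema";

// Hypothetical edit table: CPT codes that need prior authorization.
const CODES_REQUIRING_AUTH = new Set(["72148", "70553"]);

function runRuleChecks(claim: ClaimInput): string[] {
  const flags: string[] = [];

  // Authorization presence check against the (illustrative) code list.
  for (const code of claim.cptCodes) {
    if (CODES_REQUIRING_AUTH.has(code) && !claim.priorAuthId) {
      flags.push(`CPT ${code} likely requires prior authorization`);
    }
  }

  // Timely-filing check with a placeholder one-year window.
  const serviceDate = new Date(claim.dateOfService);
  const cutoff = new Date();
  cutoff.setFullYear(cutoff.getFullYear() - 1);
  if (serviceDate < cutoff) {
    flags.push("Date of service exceeds timely-filing window");
  }

  return flags;
}
```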
4) Add an audit record for every decision
Healthcare systems need replayable decisions. Persist enough detail to reconstruct why the agent recommended approve/pend/deny.
```ts
type AuditRecord = {
  claimId: string;
  timestamp: string;
  policySources?: string[];
  summary?: string;
};

// Stand-in writer; production should target an immutable store.
async function writeAudit(record: AuditRecord) {
  console.log(JSON.stringify(record));
}

async function processClaim(rawClaim: unknown) {
  const result = await reviewClaim(rawClaim);

  await writeAudit({
    claimId: result.claimId,
    timestamp: new Date().toISOString(),
    // Taken from the retrieved nodes rather than hardcoded, so the
    // record reflects what the model actually saw.
    policySources: result.sources,
    summary: result.resultText.slice(0, 1000),
  });

  return result;
}
```
That audit trail should go to an immutable store in production. If legal or compliance asks why a claim was pended six months later, you need exact inputs and sources.
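A minimal sketch of what replayable can look like: hash the exact raw input with node:crypto and append the record to a write-once log. The JSONL file here is a stand-in for whatever immutable store (WORM bucket, ledger table) you actually use:

```ts
import { createHash } from "node:crypto";
import { appendFile } from "node:fs/promises";

// Append-only audit write. The file path is a placeholder for an
// immutable store in production.
async function writeImmutableAudit(record: AuditRecord, rawClaim: unknown) {
  const entry = {
    ...record,
    // Hashing the exact input lets you prove later what was reviewed.
    claimHash: createHash("sha256")
      .update(JSON.stringify(rawClaim))
      .digest("hex"),
  };
  await appendFile("audit.jsonl", JSON.stringify(entry) + "\n");
}
```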
Production Considerations
- **Data residency**
  - Keep PHI and indexed policy content in-region.
  - If your vector database or LLM endpoint crosses regions, document it and get security sign-off.
- **Monitoring**
  - Track approval rate, pend rate, denial rate, retrieval hit quality, and human override rate.
  - Alert when the model starts over-relying on weak evidence or when certain CPT groups spike in manual review.
- **Guardrails** (a routing sketch follows this list)
  - Block autonomous denial when confidence is low or when required evidence is missing.
  - Force human review for high-dollar claims, behavioral health claims, oncology claims, or any case involving incomplete documentation.
- **Compliance logging**
  - Store prompt version, model version, retrieved node IDs, and final action.
  - Treat these as regulated artifacts; they belong in your audit pipeline, not application logs alone.
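Here is the routing gate promised above. Every threshold and category in it is an illustrative placeholder; the real values belong to your compliance team:

```ts
// Hypothetical guardrail gate. Thresholds and categories are placeholders.
const FORCED_REVIEW_CATEGORIES = new Set(["behavioral_health", "oncology"]);

function requiresHumanReview(opts: {
  confidence: number;     // model-reported confidence, 0..1
  billedAmount: number;   // claim dollar value
  category: string;       // claim line category
  recommendation: "approve" | "pend" | "deny";
}): boolean {
  if (opts.recommendation === "deny") return true; // never auto-deny
  if (opts.confidence < 0.8) return true;          // low-confidence gate
  if (opts.billedAmount > 10_000) return true;     // high-dollar gate
  return FORCED_REVIEW_CATEGORIES.has(opts.category);
}
```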
Common Pitfalls
- **Using the LLM as the rules engine.** Don't ask the model to decide eligibility or coding edits from scratch. Put deterministic checks in code first; use LlamaIndex for retrieval-backed explanation and triage.
- **Indexing raw PHI without controls.** Claims data often contains identifiers you do not want in general-purpose indexes. Separate policy indexes from case-specific stores and apply retention limits aggressively.
- **No explanation path for reviewers.** A recommendation without cited policy text will get rejected by operations teams. Always return source snippets or document references alongside the decision so reviewers can verify it quickly; a citation sketch follows this list.
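For that citation path, the query engine's response already carries its source nodes. A sketch that maps them into reviewer-facing citations, assuming the `response` object from step 3 (the `sourceNodes` shape can vary by LlamaIndex.TS version):

```ts
import { MetadataMode } from "llamaindex";

// Turn retrieved nodes into citations a reviewer can verify quickly.
const citations = (response.sourceNodes ?? []).map((n) => ({
  source: n.node.metadata?.source ?? "unknown",
  score: n.score,
  excerpt: n.node.getContent(MetadataMode.NONE).slice(0, 200),
}));
```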
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.