How to Build a Transaction Monitoring Agent Using LlamaIndex in TypeScript for Healthcare
A transaction monitoring agent in healthcare watches claims, payments, refunds, prior auth events, and access patterns to flag suspicious or policy-breaking activity before it becomes a compliance issue. It matters because healthcare data is regulated, financial leakage is expensive, and false positives can overwhelm operations if the agent is not built with domain rules and auditability from day one.
Architecture
- Event ingestion layer
  - Pulls transactions from Kafka, SQS, webhook handlers, or database change streams.
  - Normalizes claim IDs, provider IDs, member IDs, CPT/ICD codes, amounts, timestamps, and locations.
- Policy and rules engine
  - Encodes hard healthcare rules like duplicate billing, out-of-network anomalies, unusual refund patterns, and impossible service timing.
  - Produces deterministic flags before any LLM reasoning runs.
- LlamaIndex agent layer
  - Uses OpenAI or another supported LLM through LlamaIndex.
  - Orchestrates tool calls for retrieval, summarization, and classification.
- Retrieval index over policy and case history
  - Stores internal SOPs, payer rules, audit playbooks, prior incidents, and escalation runbooks.
  - Lets the agent ground decisions in approved policy text instead of hallucinating.
- Case management output
  - Writes structured findings to a case system with risk score, evidence snippets, rule hits, and recommended next action.
  - Keeps human reviewers in the loop for anything ambiguous.
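Before any implementation detail, these layers can be sketched as plain TypeScript contracts. The interface names below are ours rather than LlamaIndex APIs, and Transaction and CaseRecord refer to the types defined in the implementation that follows:

// Illustrative layer boundaries; wiring them together is the agent's job.
interface RuleEngine {
  evaluate(tx: Transaction): string[];                // deterministic flags, no LLM involved
}

interface PolicyRetrieval {
  retrievePolicies(query: string): Promise<string[]>; // approved policy snippets for grounding
}

interface AssessmentAgent {
  assess(tx: Transaction, flags: string[], policyContext: string[]): Promise<string>; // LLM reasoning step
}

interface CaseSink {
  writeCase(record: CaseRecord): Promise<void>;       // append-only output for human review
}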
Implementation
1) Install the packages and define your event shape
Use the TypeScript packages from LlamaIndex and keep your transaction schema strict. In healthcare workflows, you want typed inputs because unstructured JSON from upstream systems becomes a compliance problem fast.
npm install llamaindex zod
import { z } from "zod";

export const TransactionSchema = z.object({
  transactionId: z.string(),
  memberId: z.string(),
  providerId: z.string(),
  amount: z.number(),
  currency: z.string().default("USD"),
  transactionType: z.enum(["claim", "refund", "prior_auth", "payment"]),
  serviceCode: z.string(),
  location: z.string(),
  timestamp: z.string(),
});

export type Transaction = z.infer<typeof TransactionSchema>;
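For example, validating an inbound event before it reaches the agent might look like this; the payload values are made up:

// Reject malformed upstream events early instead of letting them reach the LLM.
const parsed = TransactionSchema.safeParse({
  transactionId: "txn_123",
  memberId: "m_456",
  providerId: "p_789",
  amount: 1250.0,
  currency: "USD",
  transactionType: "refund",
  serviceCode: "99213",
  location: "NY",
  timestamp: new Date().toISOString(),
});

if (!parsed.success) {
  // Send to a dead-letter queue or error log; do not silently drop.
  console.error(parsed.error.flatten());
} else {
  const tx: Transaction = parsed.data;
  // tx is now fully typed and safe to pass into the monitoring pipeline.
}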
2) Build a small policy index for retrieval
Keep your policies in source-controlled text files or a controlled document store. Then load them into a vector index so the agent can cite the exact rule it used during review.
import { Document, VectorStoreIndex } from "llamaindex";

const policyDocs = [
  new Document({
    text: `
      Duplicate billing rule:
      Flag claims when the same memberId, providerId, serviceCode,
      and date of service appear more than once within a 24 hour window.
      Escalate to compliance if amount exceeds $5000.
    `,
    metadata: { source: "billing_policy_v1" },
  }),
  new Document({
    text: `
      Refund anomaly rule:
      Refunds above $1000 require manual review.
      Refunds issued within 48 hours of claim submission should be reviewed
      for potential abuse or processing error.
    `,
    metadata: { source: "refund_policy_v1" },
  }),
];

const policyIndex = await VectorStoreIndex.fromDocuments(policyDocs);
const policyRetriever = policyIndex.asRetriever({ similarityTopK: 2 });
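If your policies live in source-controlled text files as suggested above, you can load them at startup instead of hard-coding strings. This sketch assumes a local policies/ directory with one rule per .txt file:

import { promises as fs } from "node:fs";
import path from "node:path";
import { Document, VectorStoreIndex } from "llamaindex";

// Assumed layout: policies/billing_policy_v1.txt, policies/refund_policy_v1.txt, ...
async function loadPolicyIndex(dir: string) {
  const files = (await fs.readdir(dir)).filter((f) => f.endsWith(".txt"));
  const docs = await Promise.all(
    files.map(async (file) => {
      const text = await fs.readFile(path.join(dir, file), "utf8");
      // Use the file name (minus extension) as the citable source id.
      return new Document({ text, metadata: { source: path.parse(file).name } });
    })
  );
  return VectorStoreIndex.fromDocuments(docs);
}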
3) Create an agent that combines rules plus retrieval
The pattern here is simple: run deterministic checks first, then use LlamaIndex to retrieve relevant policy context and generate a structured assessment. For healthcare monitoring this gives you auditability without giving up flexible reasoning.
import { OpenAI } from "llamaindex";

const llm = new OpenAI({
  model: "gpt-4o-mini",
});

function deterministicFlags(tx: Transaction): string[] {
  const flags: string[] = [];

  if (tx.transactionType === "refund" && tx.amount > 1000) {
    flags.push("Refund exceeds manual review threshold");
  }
  if (tx.transactionType === "claim" && tx.amount > 5000) {
    flags.push("High-value claim");
  }

  return flags;
}

export async function assessTransaction(rawTx: unknown) {
  const tx = TransactionSchema.parse(rawTx);
  const ruleFlags = deterministicFlags(tx);

  const retrieved = await policyRetriever.retrieve({
    query: `${tx.transactionType} ${tx.serviceCode} ${tx.amount} ${ruleFlags.join(" ")}`,
  });

  const contextText = retrieved
    .map((node) => node.node.getContent())
    .join("\n\n");

  const prompt = `
You are a healthcare transaction monitoring assistant.
Return JSON with fields:
risk_level (low|medium|high),
reason,
evidence,
recommended_action
Transaction:
${JSON.stringify(tx)}
Rule flags:
${ruleFlags.join("; ") || "none"}
Relevant policy context:
${contextText}
`;

  const response = await llm.complete({ prompt });

  return {
    transactionId: tx.transactionId,
    ruleFlags,
    assessmentText: response.text,
    retrievedPolicies: retrieved.map((r) => r.node.metadata),
  };
}
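The duplicate billing policy above can also be enforced deterministically before the LLM step. Below is a minimal in-memory sketch; a real deployment would key a durable store (Redis, Postgres) the same way:

// Recent claim timestamps keyed by memberId + providerId + serviceCode.
const recentClaims = new Map<string, number[]>();
const DUPLICATE_WINDOW_MS = 24 * 60 * 60 * 1000;

export function isDuplicateBilling(tx: Transaction): boolean {
  if (tx.transactionType !== "claim") return false;

  const key = `${tx.memberId}:${tx.providerId}:${tx.serviceCode}`;
  const now = new Date(tx.timestamp).getTime();

  // Keep only claims inside the 24 hour window, then check for a prior hit.
  const recent = (recentClaims.get(key) ?? []).filter((t) => now - t < DUPLICATE_WINDOW_MS);
  const duplicate = recent.length > 0;

  recent.push(now);
  recentClaims.set(key, recent);
  return duplicate;
}

A hit here can simply push another entry onto the flags array inside deterministicFlags.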
4) Persist outputs for audit and human review
Do not let the agent be the system of record. Store the raw input, rule hits, retrieved policy references, model output, and reviewer decision in an append-only store.
type CaseRecord = {
  transactionId: string;
  riskLevel?: string;
  assessmentText: string;
  ruleFlags: string[];
  retrievedPolicies: unknown[];
  createdAt: string;
};

export async function writeCase(record: CaseRecord) {
  // Replace with Postgres + immutable audit table or your case system API.
  console.log(JSON.stringify(record));
}
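Putting the steps together, a queue consumer or webhook handler might look like the sketch below; the handler name and the review-queue step are placeholders for your own ingestion and case systems:

export async function handleTransactionEvent(rawEvent: unknown) {
  const assessment = await assessTransaction(rawEvent);

  // Persist everything needed to reconstruct the decision later.
  await writeCase({
    transactionId: assessment.transactionId,
    assessmentText: assessment.assessmentText,
    ruleFlags: assessment.ruleFlags,
    retrievedPolicies: assessment.retrievedPolicies,
    createdAt: new Date().toISOString(),
  });

  // Route anything with rule hits to human review rather than auto-closing.
  if (assessment.ruleFlags.length > 0) {
    // e.g. push to a review queue or case management API (not shown).
  }
}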
Production Considerations
- Data residency
  - Keep PHI inside approved regions only.
  - If your LLM endpoint crosses regions, block it at the network layer before deployment.
- Audit trails
  - Log every input hash, retrieved document ID, model version, prompt template version, and final decision.
- Guardrails
  - Redact PHI fields that are not needed for risk scoring (see the sketch after this list).
  - Use allowlisted tools only; do not expose arbitrary database access to the agent.
- Monitoring
  - Track false positive rate by transaction type and provider segment.
  - Alert on drift in amount distributions, refund frequency, and escalation volume.
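As an example of the redaction and audit-hash points above, a small pre-processing step might look like this. The field choices are assumptions; align them with your own minimum-necessary analysis:

import { createHash } from "node:crypto";

// Hash the full raw payload for the audit trail without storing PHI in logs.
function auditHash(raw: unknown): string {
  return createHash("sha256").update(JSON.stringify(raw)).digest("hex");
}

// Strip fields the risk model does not need before any prompt is built.
function redactForScoring(tx: Transaction) {
  const { memberId, location, ...rest } = tx;
  return { ...rest, memberId: auditHash(memberId).slice(0, 12) }; // pseudonymized id
}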
Common Pitfalls
- Using the LLM as the first line of defense
  Start with deterministic rules. In healthcare monitoring you need explainable triggers before any generative step runs.
- Retrieving broad policy documents without metadata control
  Chunk policies by domain like billing, refunds, access control, and prior auth. Otherwise retrieval returns noisy context that weakens decisions (see the sketch after this list).
- Skipping reviewer feedback loops
  Every high-risk case should capture the analyst's disposition. Without that feedback you cannot tune thresholds or measure precision over time.
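One way to keep retrieval scoped, reusing only the constructs already shown above, is to build one index per policy domain and pick the retriever from the transaction type. The billingDocs, refundDocs, priorAuthDocs, and paymentDocs arrays here are assumed to be Document lists split by domain:

// Hypothetical per-domain document sets; each contains only that domain's policies.
const domainIndexes = {
  claim: await VectorStoreIndex.fromDocuments(billingDocs),
  refund: await VectorStoreIndex.fromDocuments(refundDocs),
  prior_auth: await VectorStoreIndex.fromDocuments(priorAuthDocs),
  payment: await VectorStoreIndex.fromDocuments(paymentDocs),
};

function retrieverFor(tx: Transaction) {
  // Fall back to the shared policyIndex if a domain has no dedicated documents.
  const index = domainIndexes[tx.transactionType] ?? policyIndex;
  return index.asRetriever({ similarityTopK: 2 });
}

Inside assessTransaction you would then call retrieverFor(tx) instead of the shared policyRetriever.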
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.