How to Build a Compliance-Checking Agent Using LlamaIndex in TypeScript for Healthcare

By Cyprian Aarons · Updated 2026-04-21
compliance-checking · llamaindex · typescript · healthcare

A compliance-checking agent in healthcare reads clinical, operational, or patient-facing text and flags whether it violates policy, regulation, or internal controls. That matters because a bad response can expose PHI, create audit risk, or trigger a regulatory issue under HIPAA, local data residency rules, or internal medical policy.

Architecture

  • Policy corpus ingestion
    • Load HIPAA policies, internal SOPs, retention rules, and approved clinical language into a vector index.
  • Retrieval layer
    • Use VectorStoreIndex plus a retriever to pull the most relevant policy snippets for each user prompt or generated response.
  • Compliance evaluator
    • Run an LLM-backed checker that compares the candidate text against retrieved policy context and returns pass/fail plus reasons.
  • Audit logger
    • Persist every decision with prompt hash, retrieved sources, model version, timestamp, and reviewer metadata.
  • Redaction / PHI guardrail
    • Detect and mask PHI before storage or downstream routing when the agent sees patient data (a minimal masking sketch follows this list).
  • Escalation path
    • Route uncertain cases to a human compliance reviewer instead of auto-approving.
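
Most of these components are implemented step by step below. The redaction guardrail is the one piece not shown there, so here is a minimal masking sketch. The regex patterns are illustrative placeholders only; a production deployment would normally rely on a vetted de-identification library or service.

// Minimal PHI redaction sketch. The patterns are illustrative assumptions,
// not a complete PHI detector.
const PHI_PATTERNS: Array<[RegExp, string]> = [
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"],                     // US SSN format
  [/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[EMAIL]"],             // email addresses
  [/\b\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g, "[PHONE]"], // US phone numbers
  [/\bMRN[:\s]*\d+\b/gi, "[MRN]"],                         // medical record numbers
];

function redactPHI(text: string): string {
  return PHI_PATTERNS.reduce(
    (masked, [pattern, label]) => masked.replace(pattern, label),
    text
  );
}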

Implementation

1) Install dependencies and load policy documents

Use LlamaIndex TS with a local or private model endpoint. For healthcare, keep the model and vector store inside your approved environment.

npm install llamaindex

import {
  Document,
  VectorStoreIndex,
  Settings,
  OpenAI,
} from "llamaindex";

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY!,
});

const policyDocs = [
  new Document({
    text: `
      HIPAA rule: Do not disclose PHI unless the requester is authorized.
      Minimum necessary standard applies to all disclosures.
      Patient identifiers must not be included in unapproved channels.
    `,
    metadata: { source: "hipaa_policy_v1", type: "policy" },
  }),
  new Document({
    text: `
      Internal policy: Any outbound message mentioning diagnosis,
      treatment plan, medication changes, or lab results requires review
      if sent outside the EHR system.
    `,
    metadata: { source: "clinical_comm_policy_v3", type: "policy" },
  }),
];

const index = await VectorStoreIndex.fromDocuments(policyDocs);
const retriever = index.asRetriever({ similarityTopK: 3 });
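
If your approved environment calls for on-disk persistence rather than an in-memory index, LlamaIndex TS can persist the index to a local directory. The snippet below is a sketch: the persistDir path is an assumption, and the exact storage helpers vary between llamaindex versions, so check the docs for the version you have installed.

import { storageContextFromDefaults } from "llamaindex";

// Persist embeddings and index metadata inside the approved environment.
// "./compliance-storage" is an illustrative path.
const storageContext = await storageContextFromDefaults({
  persistDir: "./compliance-storage",
});

const persistedIndex = await VectorStoreIndex.fromDocuments(policyDocs, {
  storageContext,
});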

2) Build a compliance check function with retrieval + structured output

The pattern here is simple: retrieve relevant policy context first, then ask the LLM for a decision grounded in that context. Don’t let the model judge compliance from memory alone.

type ComplianceResult = {
  decision: "pass" | "fail" | "needs_review";
  reasons: string[];
  citedPolicies: string[];
};

async function checkCompliance(candidateText: string): Promise<ComplianceResult> {
  const nodes = await retriever.retrieve(candidateText);
  const policyContext = nodes
    .map((n) => `SOURCE=${n.node.metadata?.source}\nTEXT=${n.node.getContent()}`)
    .join("\n\n");

  const prompt = `
You are a healthcare compliance checker.

Evaluate the candidate text against the provided policy context only.
Return strict JSON with keys:
decision ("pass" | "fail" | "needs_review"),
reasons (array of strings),
citedPolicies (array of source ids).

Policy context:
${policyContext}

Candidate text:
${candidateText}
`;

  const response = await Settings.llm.complete({ prompt });
  return JSON.parse(response.text) as ComplianceResult;
}

const result = await checkCompliance(
  "Please email me the patient's latest lab results and diagnosis."
);

console.log(result);
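
One practical wrinkle: even with a "strict JSON" instruction, models sometimes wrap the payload in markdown fences or return something unparsable. A small hedge is to clean the response and fall back to needs_review on parse failure, so malformed output lands in the escalation path rather than being trusted:

// Defensive parsing: strip markdown fences and treat unparsable output as
// needs_review instead of throwing.
function parseComplianceResult(raw: string): ComplianceResult {
  const cleaned = raw.replace(/```(?:json)?/g, "").trim();
  try {
    return JSON.parse(cleaned) as ComplianceResult;
  } catch {
    return {
      decision: "needs_review",
      reasons: ["Model response was not valid JSON"],
      citedPolicies: [],
    };
  }
}

// Inside checkCompliance, swap the JSON.parse call for:
// return parseComplianceResult(response.text);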

3) Add an agent wrapper for routing and escalation

For production use, wrap the checker so it can decide between auto-pass, block, or human review. This is where you enforce your healthcare workflow.

async function routeComplianceDecision(text: string) {
  const result = await checkCompliance(text);

  if (result.decision === "pass") {
    return { action: "allow", result };
  }

  if (result.decision === "fail") {
    return { action: "block", result };
  }

  return {
    action: "escalate",
    queue: "compliance-review",
    result,
  };
}

const routed = await routeComplianceDecision(
  "Summarize the patient's discharge instructions for external email."
);

console.log(routed);

4) Store audit records for traceability

Healthcare teams need to be able to answer who checked what, when, and against which policies. Keep immutable logs with request hashes and source references.

import crypto from "crypto";

function hashText(input: string) {
  return crypto.createHash("sha256").update(input).digest("hex");
}

async function auditLog(inputText: string, decision: ComplianceResult) {
  const record = {
    requestHash: hashText(inputText),
    decision,
    model: "gpt-4o-mini",
    timestamp: new Date().toISOString(),
    domain: "healthcare-compliance",
  };

  console.log(JSON.stringify(record));
}

const text = "Can you send this patient's medication list to my personal email?";
const decision = await checkCompliance(text);
await auditLog(text, decision);
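
The auditLog above only prints to stdout. Here is a minimal persistence sketch, assuming an append-only JSONL file on local disk is acceptable in your environment (many teams will point this at a write-once store or SIEM instead); the file path is an assumption:

import { appendFile, mkdir } from "fs/promises";

// Append each record as one JSON line to an append-only file.
// Restrict access to this file and keep raw PHI out of the record itself.
async function persistAuditRecord(record: Record<string, unknown>) {
  await mkdir("./audit", { recursive: true });
  await appendFile("./audit/compliance.jsonl", JSON.stringify(record) + "\n", "utf8");
}

// Call this from auditLog in place of console.log.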

Production Considerations

  • Keep data residency explicit
    • Store policies, embeddings, logs, and prompts in-region. If your hospital requires EU-only or US-only processing, enforce that at the infrastructure layer.
  • Treat every output as auditable
    • Log retrieved policy IDs, decision reason codes, model version, and request hash. Avoid storing raw PHI unless your retention policy allows it.
  • Add hard guardrails before generation
    • Redact PHI where possible and block requests that clearly violate access control before they reach the LLM (see the gate sketch after this list).
  • Use human review for ambiguous cases
    • Anything involving diagnosis disclosure, minors, behavioral health notes, or cross-border transfer should default to needs_review.
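
As a sketch of the "hard guardrails before generation" point, the gate below blocks obviously disallowed requests with a cheap pattern check, redacts PHI, and only then calls the LLM-backed checker. The block patterns are illustrative assumptions, not an approved policy list, and redactPHI refers to the masking sketch in the Architecture section.

// Pre-generation gate: cheap block list, then redaction, then the LLM checker.
const HARD_BLOCK_PATTERNS = [
  /personal email/i,
  /social media/i,
];

async function gatedComplianceCheck(text: string) {
  if (HARD_BLOCK_PATTERNS.some((pattern) => pattern.test(text))) {
    return { action: "block", reason: "matched hard guardrail pattern" };
  }
  const sanitized = redactPHI(text); // mask PHI before it reaches the LLM
  return routeComplianceDecision(sanitized);
}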

Common Pitfalls

  • Using the LLM without retrieval

    If you ask the model “is this compliant?” without grounding it in current policy text, you get vague judgments. Always retrieve policy snippets first with VectorStoreIndex and pass them into the evaluation prompt.

  • Logging raw patient data everywhere

    Teams often dump prompts into observability tools. That creates unnecessary PHI exposure. Hash requests, redact sensitive fields, and restrict access to audit logs.

  • Auto-approving borderline cases

    A message that looks harmless can still violate minimum necessary rules or disclosure constraints. Use needs_review for uncertainty instead of forcing pass/fail.

  • Ignoring versioning of policies

    Compliance changes over time. Tag every document with version metadata and rebuild indexes when policies change so decisions reflect current rules; a short sketch follows below.
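
For the versioning pitfall, here is a short sketch of version-aware policy metadata plus a rebuild step; the field names are illustrative, not a required schema:

// Tag each policy with explicit version metadata so audit records can show
// which revision a decision was based on.
const versionedPolicy = new Document({
  text: "Updated minimum necessary guidance ...",
  metadata: {
    source: "hipaa_policy",
    version: "v2",
    effectiveDate: "2026-01-01",
    type: "policy",
  },
});

// Rebuild the index whenever policies change so retrieval reflects current rules.
async function rebuildPolicyIndex(docs: Document[]) {
  return VectorStoreIndex.fromDocuments(docs);
}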


By Cyprian Aarons, AI Consultant at Topiax.
