How to Build a Claims Processing Agent Using LangChain in TypeScript for Healthcare

By Cyprian Aarons · Updated 2026-04-21
claims-processing · langchain · typescript · healthcare

A claims processing agent for healthcare takes a claim, extracts the relevant policy and clinical details, checks eligibility and coverage rules, flags missing documentation, and routes the case to the right next step. It matters because claims are high-volume, high-cost, and highly regulated; small mistakes create denials, delays, compliance issues, and avoidable manual work.

Architecture

  • Claim intake layer

    • Receives claim submissions from EHRs, payer portals, PDF uploads, or API payloads.
    • Normalizes fields like member ID, CPT/HCPCS codes, ICD-10 codes, dates of service, provider NPI, and attachments.
  • Document extraction layer

    • Uses OCR or document parsing before the LLM sees the data.
    • Converts clinical notes, referral letters, and itemized bills into text chunks with metadata.
  • LangChain reasoning layer

    • Uses ChatOpenAI plus prompts to classify claim type, identify missing fields, and generate next actions.
    • Keeps the model constrained to a fixed schema using StructuredOutputParser or tool-style outputs.
  • Rules and policy engine

    • Applies deterministic checks for eligibility windows, prior authorization requirements, coding mismatches, and benefit limits.
    • This should live outside the model so auditability stays clean.
  • Audit and case management layer

    • Stores every input, model output, rule decision, and human override.
    • Required for healthcare compliance reviews and denial appeals.
  • Human review queue

    • Escalates uncertain claims or low-confidence outputs to a claims analyst.
    • Prevents automated decisions on ambiguous or high-risk cases.
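The intake layer's normalization step can be sketched as a pure function. The `RawClaimInput` shape and field names below are assumptions for illustration, not a fixed interface:

```typescript
// Hypothetical raw payload shape from an EHR export or payer portal.
interface RawClaimInput {
  member_id?: string;
  provider_npi?: string;
  cpt_codes?: string;    // e.g. "99213, 93000"
  icd10_codes?: string;  // e.g. "e11.9"
  service_date?: string; // e.g. "04/21/2026"
}

export interface NormalizedClaim {
  memberId: string | null;
  providerNpi: string | null;
  procedureCodes: string[];
  diagnosisCodes: string[];
  serviceDate: string | null; // ISO yyyy-mm-dd
}

// Normalize whitespace, code casing, and date format before anything
// downstream (rules engine or LLM) sees the claim.
export function normalizeClaim(raw: RawClaimInput): NormalizedClaim {
  const splitCodes = (s?: string) =>
    (s ?? "")
      .split(/[,;\s]+/)
      .map((c) => c.trim().toUpperCase())
      .filter(Boolean);

  const toIsoDate = (s?: string): string | null => {
    if (!s) return null;
    const m = s.match(/^(\d{2})\/(\d{2})\/(\d{4})$/); // mm/dd/yyyy
    return m ? `${m[3]}-${m[1]}-${m[2]}` : s;
  };

  return {
    memberId: raw.member_id?.trim() || null,
    providerNpi: raw.provider_npi?.trim() || null,
    procedureCodes: splitCodes(raw.cpt_codes),
    diagnosisCodes: splitCodes(raw.icd10_codes),
    serviceDate: toIsoDate(raw.service_date?.trim()),
  };
}
```

Normalizing this early means every later layer can assume one canonical shape, which keeps both the rules engine and the prompts simpler.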

Implementation

1. Define the claim schema and output contract

Start by forcing the agent to produce a strict JSON shape. In healthcare workflows you want predictable fields you can validate before any downstream action.

import { z } from "zod";

export const ClaimSchema = z.object({
  claimId: z.string(),
  memberId: z.string(),
  providerNpi: z.string(),
  serviceDate: z.string(),
  procedureCodes: z.array(z.string()),
  diagnosisCodes: z.array(z.string()),
  missingFields: z.array(z.string()),
  riskLevel: z.enum(["low", "medium", "high"]),
  recommendedAction: z.enum([
    "auto_adjudicate",
    "request_more_info",
    "route_to_human_review",
    "deny_for_policy"
  ]),
});

export type ClaimReview = z.infer<typeof ClaimSchema>;

This schema is your contract. If the model returns malformed data, reject it before it touches adjudication logic.

2. Build the LangChain chain in TypeScript

Use LangChain’s core classes: ChatOpenAI, PromptTemplate, StructuredOutputParser, and RunnableSequence. The model should only classify and summarize; it should never make the final payment decision.

import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { StructuredOutputParser } from "@langchain/core/output_parsers";
import { RunnableSequence } from "@langchain/core/runnables";
import { ClaimSchema } from "./schema";

const parser = StructuredOutputParser.fromZodSchema(ClaimSchema);

const prompt = PromptTemplate.fromTemplate(`
You are a healthcare claims processing assistant.
Extract claim review data from the input below.

Rules:
- Do not invent missing values.
- If information is incomplete, list it in missingFields.
- Only use the allowed riskLevel and recommendedAction values.

{format_instructions}

Claim input:
{claimText}
`);

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

export const claimReviewChain = RunnableSequence.from([
  async (input: { claimText: string }) => ({
    claimText: input.claimText,
    format_instructions: parser.getFormatInstructions(),
  }),
  prompt,
  llm,
]);

This keeps generation deterministic enough for operations. Temperature stays at zero because you want stable outputs for repeatable claim triage.

3. Add post-processing validation and business rules

The LLM can identify gaps, but your deterministic rules decide whether the case can move forward. That separation is important for auditability in healthcare.

import { ClaimSchema, type ClaimReview } from "./schema";
import { claimReviewChain } from "./chain";

function applyPolicyRules(review: ClaimReview): ClaimReview {
  if (review.missingFields.length > 0) {
    return { ...review, recommendedAction: "request_more_info" as const };
  }

  if (review.riskLevel === "high") {
    return { ...review, recommendedAction: "route_to_human_review" as const };
  }

  return review;
}

export async function processClaim(claimText: string) {
  const raw = await claimReviewChain.invoke({ claimText });

  // The model may wrap its JSON in markdown fences; strip them, then
  // fail closed if the payload does not match the schema.
  const text = (raw.content as string).replace(/```(?:json)?/g, "").trim();
  const parsed = ClaimSchema.parse(JSON.parse(text));

  const finalDecision = applyPolicyRules(parsed);

  return {
    ...finalDecision,
    auditTrail: {
      model: "gpt-4o-mini",
      timestamp: new Date().toISOString(),
      sourceLength: claimText.length,
    },
  };
}

In production you would also store the raw model output separately. Never overwrite source evidence with derived data.
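The raw-output rule can be enforced with an append-only audit record. A sketch where `AuditEntry` and the in-memory array are illustrative stand-ins for a real persistence layer:

```typescript
// Illustrative audit entry: the raw model output is stored verbatim,
// separately from the parsed/derived decision, so evidence is never
// overwritten by downstream processing.
interface AuditEntry {
  claimId: string;
  timestamp: string;
  promptVersion: string;
  rawModelOutput: string; // never mutated after write
  parsedOutput: unknown;  // derived, re-computable
  ruleResults: string[];
}

// Stand-in for an append-only store (e.g. WORM storage or an
// insert-only table in production).
const auditLog: AuditEntry[] = [];

export function appendAudit(entry: AuditEntry): Readonly<AuditEntry> {
  const frozen = Object.freeze({ ...entry }); // immutable once written
  auditLog.push(frozen);
  return frozen;
}

export function auditCount(): number {
  return auditLog.length;
}
```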

4. Wire in retrieval for payer policies and benefits

Claims logic depends on plan-specific rules. Use retrieval to fetch policy snippets by payer, plan code, or state so the agent can explain why a claim needs review.

import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

async function buildPolicyStore(policyDocs: string[]) {
  return MemoryVectorStore.fromTexts(
    policyDocs,
    policyDocs.map((_, i) => ({ docId: `policy-${i}` })),
    new OpenAIEmbeddings()
  );
}

Use retrieved policy text as context only. The final decision still comes from your rules engine plus human review where required.
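Attaching retrieved snippets as clearly labeled context can be sketched with a plain formatting helper. The `PolicySnippet` shape and labels here are assumptions, loosely mirroring a retrieved LangChain document:

```typescript
// Local stand-in for a retrieved policy snippet.
interface PolicySnippet {
  docId: string;
  text: string;
}

// Render snippets as labeled context so the model can say which policy
// it relied on; the decision itself still belongs to the rules engine.
export function buildPolicyContext(snippets: PolicySnippet[]): string {
  if (snippets.length === 0) return "No matching policy text found.";
  return snippets
    .map((s, i) => `[Policy ${i + 1} | ${s.docId}]\n${s.text}`)
    .join("\n\n");
}
```

The labeled `docId` also gives you a direct hook for the audit trail: every explanation can cite the exact snippet it was grounded in.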

Production Considerations

  • Compliance controls

    • Treat all PHI as sensitive by default.
    • Encrypt at rest and in transit.
    • Log access to claims data with user identity, purpose of access, and timestamp.
  • Data residency

    • Keep processing in-region when contracts require it.
    • If you operate across jurisdictions, route EU or country-specific health data to approved infrastructure only.
    • Verify your model provider’s storage and retention settings before sending PHI.
  • Monitoring

    • Track parse failure rate, human escalation rate, denial reason frequency, and hallucination rate against gold-label samples.
    • Alert when outputs drift on common claim types like outpatient surgery or durable medical equipment.
    • Sample decisions daily for QA review by claims specialists.
  • Guardrails

    • Block autonomous denials for ambiguous cases.
    • Require confidence thresholds plus rule checks before auto-adjudication.
    • Add redaction before prompts so unnecessary PHI never reaches the model.
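Pre-prompt redaction can start as a simple pattern pass. These regexes are illustrative only and nowhere near a complete PHI filter; real deployments need a dedicated PHI detection service:

```typescript
// Minimal pattern-based redaction run before any text reaches the
// model. Covers only a few obvious formats (SSN-like, US phone, email).
const PATTERNS: Array<[RegExp, string]> = [
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"],
  [/\b\d{3}[-.]\d{3}[-.]\d{4}\b/g, "[PHONE]"],
  [/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, "[EMAIL]"],
];

export function redactPhi(text: string): string {
  return PATTERNS.reduce(
    (acc, [pattern, label]) => acc.replace(pattern, label),
    text
  );
}
```

Run this on claim text before it is interpolated into the prompt template, so unnecessary identifiers never leave your boundary.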

Common Pitfalls

  1. Letting the LLM make final payment decisions

    • Don’t do this.
    • Use the model for extraction, classification, and explanation; use deterministic policy logic for adjudication.
  2. Skipping schema validation

    • If you accept free-form text from the model, you will eventually ship broken downstream records.
    • Validate with Zod after every call and fail closed on parse errors.
  3. Ignoring auditability

    • Healthcare teams need to explain why a claim was routed or denied.
    • Persist prompt version, retrieved policy snippets, raw output, parsed output, rule results, and human overrides in an immutable audit log.
  4. Mixing PHI into broad prompts

    • Only send what is needed for the task.
    • Redact names where possible and keep identifiers scoped to internal IDs unless exact matching is required.

A claims agent built this way does one job well: reduce manual triage without turning compliance into an afterthought. Keep LangChain focused on structured reasoning, keep policy decisions deterministic, and keep humans in the loop where risk is high.



By Cyprian Aarons, AI Consultant at Topiax.
