How to Build a Claims Processing Agent for Healthcare Using LangChain in Python

By Cyprian Aarons · Updated 2026-04-21

Tags: claims-processing, langchain, python, healthcare

A claims processing agent for healthcare reads incoming claim documents, extracts the relevant fields, checks them against payer rules, flags missing or inconsistent data, and routes the case for approval, denial, or human review. It matters because claims are high-volume, highly regulated, and expensive to process manually; a good agent reduces turnaround time without turning compliance into an afterthought.

Architecture

  • Document ingestion layer

    • Accepts PDFs, EOBs, HL7/FHIR payloads, or structured claim JSON.
    • Normalizes input into text plus metadata like patient ID, payer, service date, and source system.
  • Extraction chain

    • Uses ChatOpenAI with a structured output schema to pull fields such as CPT/HCPCS codes, ICD-10 codes, provider NPI, authorization numbers, and billed amount.
    • Returns typed data instead of free-form text.
  • Policy validation layer

    • Applies payer-specific rules and internal policy checks.
    • Verifies coverage dates, code compatibility, prior authorization requirements, and duplicate claim indicators.
  • Decision router

    • Classifies the claim into approve, deny, or manual_review.
    • Keeps the decision logic explicit so auditors can trace why a claim was routed.
  • Audit logging layer

    • Stores prompt inputs, model outputs, validation results, and final decision.
    • Needed for healthcare compliance, dispute handling, and internal QA.
  • Human escalation path

    • Sends ambiguous or high-risk claims to a human reviewer.
    • Prevents the agent from making unsupported decisions on incomplete evidence.
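The decision router deserves explicit code rather than a prompt. A minimal sketch, assuming a hypothetical `HARD_DENIALS` set of issue codes (real payer rules would come from configuration, not a hardcoded set):

```python
from typing import Literal

Decision = Literal["approve", "deny", "manual_review"]

# Hypothetical hard-denial issue codes; in practice these come from payer policy config.
HARD_DENIALS = {"duplicate_claim", "coverage_lapsed"}

def route_claim(issues: list[str]) -> Decision:
    """Map validation issues to an explicit, auditable decision."""
    if not issues:
        return "approve"
    # Any hard-denial issue sends the claim straight to deny.
    if any(issue in HARD_DENIALS for issue in issues):
        return "deny"
    # Everything else goes to a human reviewer, not an automatic call.
    return "manual_review"
```

Because the routing is plain Python, an auditor can read the exact conditions that produced any decision, which is the traceability requirement named above.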

Implementation

1) Define the claim schema

Use Pydantic so the model output is constrained. In claims workflows, you want structured data first and reasoning second.

from typing import Optional
from pydantic import BaseModel, Field

class ClaimExtraction(BaseModel):
    patient_id: str = Field(description="Internal patient identifier")
    payer_name: str = Field(description="Health plan name")
    provider_npi: str = Field(description="National Provider Identifier")
    cpt_code: str = Field(description="Primary procedure code")
    icd10_code: str = Field(description="Primary diagnosis code")
    service_date: str = Field(description="Date of service in YYYY-MM-DD")
    billed_amount: float = Field(description="Total charged amount")
    prior_auth_number: Optional[str] = Field(default=None)

2) Build an extraction chain with LangChain

This pattern uses ChatOpenAI plus with_structured_output(). That gives you typed outputs that are easier to validate than raw generations.

import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,
    api_key=os.environ["OPENAI_API_KEY"],
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract claim fields from healthcare documents. Return only valid structured data."),
    ("human", "Claim document:\n\n{document_text}")
])

extraction_chain = prompt | llm.with_structured_output(ClaimExtraction)

document_text = """
Patient ID: P12345
Payer: Acme Health
Provider NPI: 1234567890
CPT: 99213
ICD-10: J06.9
Service Date: 2025-03-12
Billed Amount: 125.00
Prior Auth: PA-77821
"""

claim = extraction_chain.invoke({"document_text": document_text})
print(claim.model_dump())

3) Add deterministic policy checks

Do not ask the LLM to be your final adjudicator. Use Python rules for anything that should be stable and auditable.

from datetime import datetime

def validate_claim(claim: ClaimExtraction) -> list[str]:
    issues = []

    if not claim.provider_npi.isdigit() or len(claim.provider_npi) != 10:
        issues.append("Invalid NPI format")

    try:
        datetime.strptime(claim.service_date, "%Y-%m-%d")
    except ValueError:
        issues.append("Invalid service date")

    if claim.billed_amount <= 0:
        issues.append("Billed amount must be positive")

    # Illustrative rule only: real prior-auth requirements come from payer policy tables.
    if claim.cpt_code == "99213" and not claim.prior_auth_number:
        issues.append("Missing prior auth for office visit code")

    return issues

issues = validate_claim(claim)
decision = "manual_review" if issues else "approve"
print({"decision": decision, "issues": issues})

4) Wrap it in a simple processing pipeline

This is the pattern you actually deploy: extract → validate → route → log. Keep each stage isolated so you can swap models without rewriting business logic.

import json
from datetime import datetime, timezone

def process_claim(document_text: str):
    extracted = extraction_chain.invoke({"document_text": document_text})
    issues = validate_claim(extracted)

    result = {
        # datetime.utcnow() is deprecated; use an explicit timezone-aware timestamp.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "extracted": extracted.model_dump(),
        "issues": issues,
        "decision": "manual_review" if issues else "approve",
    }

    with open("claim_audit_log.jsonl", "a") as f:
        f.write(json.dumps(result) + "\n")

    return result

result = process_claim(document_text)
print(result["decision"])

Production Considerations

  • Encrypt PHI in transit and at rest

    • Claims data contains protected health information. Use TLS everywhere and store logs in encrypted systems with strict access controls.
    • Never dump raw PHI into debug logs.
  • Keep audit trails immutable

    • Store prompt versions, model versions, extracted fields, validation outcomes, and final decisions.
    • You need this for appeals, internal audits, and compliance reviews.
  • Respect data residency

    • If your healthcare organization requires regional storage or processing boundaries, pin model endpoints and storage to approved regions.
    • This is not optional when contracts or regulations restrict cross-border PHI movement.
  • Add human-in-the-loop thresholds

    • Route low-confidence extractions or policy conflicts to reviewers.
    • For claims work, false approvals are usually more expensive than slower handling.
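The "never dump raw PHI into debug logs" point can be sketched as a small redaction helper applied to every record before it is written anywhere. The `PHI_FIELDS` set here is an assumption based on the schema in this guide; extend it to match your own data model.

```python
# Fields treated as identifiers in this sketch; adjust to your own schema.
PHI_FIELDS = {"patient_id", "provider_npi", "prior_auth_number"}

def redact_record(record: dict) -> dict:
    """Return a copy of an audit record with PHI fields masked."""
    redacted = {}
    for key, value in record.items():
        if key in PHI_FIELDS and value is not None:
            redacted[key] = "***REDACTED***"
        else:
            redacted[key] = value
    return redacted
```

If audit entries must remain correlatable across systems, a stable keyed hash of the identifier is a common alternative to masking, but the raw value should still never reach application logs.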

Common Pitfalls

  • Using the LLM as the final adjudicator

    • Mistake: asking the model to “decide” approval based on vague instructions.
    • Fix: use deterministic rule checks for policy enforcement and reserve the model for extraction and classification support.
  • Skipping schema enforcement

    • Mistake: accepting free-form text from the model.
    • Fix: use with_structured_output() with Pydantic models so bad outputs fail fast instead of corrupting downstream systems.
  • Ignoring PHI handling in logs

    • Mistake: writing full prompts and responses to standard application logs.
    • Fix: redact sensitive fields before logging and store audit records in a controlled compliance system with role-based access.
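Schema enforcement also needs a failure path: a structured-output call can still raise on malformed or unparseable model responses. A minimal sketch of a wrapper that converts any extraction failure into a manual-review outcome, where `extract_fn` stands in for whatever callable wraps your chain's `.invoke` (an assumption here, not a LangChain API):

```python
def safe_extract(extract_fn, document_text: str) -> dict:
    """Run an extraction callable; route failures to manual review instead of crashing."""
    try:
        return {"status": "ok", "claim": extract_fn(document_text)}
    except Exception as exc:  # schema violations, parse errors, timeouts
        # Never guess at missing fields; surface the error and hand off to a human.
        return {"status": "manual_review", "error": str(exc)}
```

Failing fast into the human escalation path keeps a single bad generation from corrupting downstream adjudication.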

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
