How to Build a Compliance-Checking Agent for Healthcare Using LangChain in Python

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, langchain, python, healthcare

A compliance checking agent for healthcare reviews text, policies, and workflow outputs against rules like HIPAA, internal PHI handling standards, and jurisdiction-specific retention requirements. It matters because a bad classification, a missed disclosure, or sending PHI to the wrong place can become a reportable incident, not just a bug.

Architecture

  • Input layer

    • Accepts clinical notes, messages, policy text, or workflow outputs.
    • Normalizes source metadata like region, tenant, document type, and sensitivity.
  • Policy retrieval layer

    • Pulls the relevant compliance rules from a curated knowledge base.
    • Uses langchain_community.vectorstores with embeddings to retrieve the right policy section for the request.
  • Compliance reasoning chain

    • Uses ChatPromptTemplate plus a chat model to compare the input against policy.
    • Produces structured findings: pass/fail, risk level, violated rule, and recommended fix.
  • Structured output parser

    • Forces the agent to return machine-readable JSON.
    • Makes it easier to route high-risk cases to human review.
  • Audit logging layer

    • Stores prompt, retrieved policy snippets, model output, timestamp, and request metadata.
    • Needed for traceability during internal audits and incident review.
  • Guardrail layer

    • Blocks unsafe actions like exposing raw PHI in logs or sending data outside approved regions.
    • Enforces redaction and escalation rules before any downstream action; a minimal gate is sketched just below.
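
To make the guardrail layer concrete, the gate below shows the minimal decision logic. It assumes the ComplianceFinding model defined in step 1 of the implementation, and leaves the actual review queue and blocking mechanics to your own infrastructure:

def guardrail_gate(finding: ComplianceFinding, caller_authorized: bool) -> str:
    """Gate a checked item before any downstream action runs."""
    # Non-compliant, high-risk content is blocked outright.
    if not finding.compliant and finding.risk_level == "high":
        return "block"
    # Risky findings and unauthorized callers go to human review.
    if finding.risk_level in ("medium", "high") or not caller_authorized:
        return "human_review"
    return "allow"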

Implementation

1) Define the compliance schema

Use Pydantic so your agent returns a validated, machine-readable structure instead of free-form prose you have to parse after the fact.

from typing import List, Literal
from pydantic import BaseModel, Field

class ComplianceFinding(BaseModel):
    compliant: bool = Field(description="Whether the content passes compliance checks")
    risk_level: Literal["low", "medium", "high"] = Field(description="Overall risk level")
    violated_rules: List[str] = Field(description="List of violated policy references")
    rationale: str = Field(description="Short explanation of the decision")
    remediation: List[str] = Field(description="Concrete fixes")
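
As a quick sanity check, you can validate a raw JSON payload against the schema; pydantic's model_validate_json raises a ValidationError on anything malformed, which is exactly the failure you want surfaced early:

raw = (
    '{"compliant": false, "risk_level": "high", '
    '"violated_rules": ["PHI-LOG-002"], '
    '"rationale": "PHI sent to a personal address", '
    '"remediation": ["Remove PHI", "Use an approved channel"]}'
)
finding = ComplianceFinding.model_validate_json(raw)
print(finding.risk_level)  # "high"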

2) Build the LangChain chain with retrieval + structured output

This pattern uses ChatPromptTemplate, itemgetter to pull individual fields from the input, and PydanticOutputParser. The retriever supplies only the relevant policy text for the current case.

from operator import itemgetter

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from langchain_openai import ChatOpenAI

parser = PydanticOutputParser(pydantic_object=ComplianceFinding)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a healthcare compliance checker. "
     "Assess content against provided policy context. "
     "Never expose PHI in your answer."),
    ("human",
     "Policy context:\n{policy_context}\n\n"
     "Request metadata:\n{metadata}\n\n"
     "Content to review:\n{content}\n\n"
     "{format_instructions}")
]).partial(format_instructions=parser.get_format_instructions())

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {
        "policy_context": lambda x: format_docs(x["docs"]),
        "metadata": RunnablePassthrough(),
        "content": RunnablePassthrough(),
    }
    | prompt
    | llm
    | parser
)
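
If your langchain-openai version supports it, with_structured_output is an alternative to the explicit parser: it binds the schema to the model call via tool calling, so you can drop the {format_instructions} placeholder from the prompt. A minimal sketch under that assumption:

structured_llm = llm.with_structured_output(ComplianceFinding)
chain_alt = (
    {
        "policy_context": lambda x: format_docs(x["docs"]),
        "metadata": itemgetter("metadata"),
        "content": itemgetter("content"),
    }
    | prompt
    | structured_llm  # returns a ComplianceFinding directly, no parser step
)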

3) Wire in a retriever for healthcare policy documents

Store your policies as documents with metadata like region and policy version. Then retrieve only what applies to the request.

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

docs = [
    Document(
        page_content="HIPAA minimum necessary rule: access only the least amount of PHI required.",
        metadata={"policy_id": "HIPAA-001", "region": "US"}
    ),
    Document(
        page_content="PHI must not be logged in plaintext or sent to non-approved vendors.",
        metadata={"policy_id": "PHI-LOG-002", "region": "US"}
    ),
]

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

request = {
    "content": "Please email the patient summary including diagnosis and medication list to my personal Gmail.",
    "metadata": {"region": "US", "doc_type": "message"}
}

retrieved_docs = retriever.invoke(request["content"])
result = chain.invoke({
    "docs": retrieved_docs,
    "content": request["content"],
    "metadata": request["metadata"],
})

print(result.model_dump())
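
Because each Document carries region metadata, you can also scope retrieval to the request's jurisdiction. The langchain_community FAISS store accepts a metadata filter through search_kwargs; a minimal sketch using the vectorstore built above:

# Only consider policies whose metadata matches the request's region.
us_retriever = vectorstore.as_retriever(
    search_kwargs={"k": 2, "filter": {"region": "US"}}
)
us_docs = us_retriever.invoke(request["content"])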

4) Add an audit trail before returning results

For healthcare workflows, you need evidence of what was checked and why. Log redacted inputs plus policy references and final findings.

import json
from datetime import datetime, timezone

def audit_event(request_id: str, request: dict, docs: list[Document], finding: ComplianceFinding):
    record = {
        "request_id": request_id,
        # datetime.utcnow() is deprecated; use an explicit timezone-aware timestamp.
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "metadata": request["metadata"],
        # Truncation alone is not redaction; run a redaction pass first (see below).
        "content_redacted_preview": request["content"][:120],
        "policy_ids": [d.metadata.get("policy_id") for d in docs],
        "finding": finding.model_dump(),
    }
    print(json.dumps(record))  # In production, write to an append-only audit store.

audit_event("req-123", request, retrieved_docs, result)

Production Considerations

  • Data residency

    • Keep embeddings, vector stores, and model inference in approved regions.
    • If your organization requires US-only processing for PHI-related workloads, enforce that at deployment time rather than in code comments.
  • PHI handling

    • Redact identifiers before logging prompts or responses.
    • Never send raw PHI into observability tools unless they are explicitly approved for healthcare data.
  • Monitoring

    • Track false positives and false negatives by policy category.
    • Measure how often cases are escalated to humans versus auto-approved; a minimal counter sketch follows this list.
  • Guardrails

    • Block any response that includes patient identifiers unless the caller is authorized.
    • Require human review for high-risk findings such as disclosure violations or cross-border transfer issues.
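
Even before a real metrics backend is wired up, simple counters keyed by violated rule and risk level give you escalation and failure rates per policy category. A minimal in-process sketch; in production, emit these to your metrics system instead:

from collections import Counter

outcome_counts: Counter = Counter()

def record_outcome(finding: ComplianceFinding, escalated: bool) -> None:
    # One count per violated rule so rates can be sliced by policy category.
    for rule in finding.violated_rules or ["none"]:
        outcome_counts[(rule, finding.risk_level, escalated)] += 1

record_outcome(result, escalated=(result.risk_level == "high"))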

Common Pitfalls

  1. Using a generic chatbot prompt instead of policy-grounded reasoning

    • This leads to inconsistent decisions and hallucinated compliance advice.
    • Fix it by always passing retrieved policy text into the prompt and forcing structured output with PydanticOutputParser.
  2. Logging raw clinical content

    • Teams do this during debugging and accidentally create a secondary PHI exposure path.
    • Fix it by redacting inputs before logs and storing only minimal previews plus document IDs.
  3. Skipping jurisdiction-aware routing

    • HIPAA is not your only constraint; state law, retention rules, and vendor agreements matter too.
    • Fix it by attaching region metadata to every request and filtering retrievers by region-specific policy sets before evaluation, as sketched below.
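
Metadata filters (step 3) help, but the stronger residency guarantee is one index per jurisdiction, so a US request can never touch EU-only policy text. A minimal sketch, where us_policy_docs and eu_policy_docs are hypothetical per-region document lists you maintain yourself:

# One vector store per jurisdiction; a request only ever searches its own.
# us_policy_docs / eu_policy_docs are placeholders for your curated sets.
region_stores = {
    "US": FAISS.from_documents(us_policy_docs, embeddings),
    "EU": FAISS.from_documents(eu_policy_docs, embeddings),
}

def retriever_for(region: str):
    if region not in region_stores:
        raise ValueError(f"No policy set for region {region!r}; refusing to guess.")
    return region_stores[region].as_retriever(search_kwargs={"k": 2})

docs_for_request = retriever_for(request["metadata"]["region"]).invoke(request["content"])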

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

