How to Integrate LangGraph with LangSmith for Healthcare RAG

By Cyprian Aarons · Updated 2026-04-22
Tags: langgraph-for-healthcare, langsmith, rag

Combining LangGraph with LangSmith gives you a controlled way to build retrieval-augmented healthcare agents that can reason over clinical content, keep workflow state, and produce traces you can actually debug. The practical win is simple: a healthcare-specific graph for orchestrating patient-safe steps, plus observability for every retrieval, prompt, and model decision in the RAG pipeline.

Prerequisites

  • Python 3.10+
  • langgraph
  • langchain
  • langsmith
  • A supported LLM provider key, such as OPENAI_API_KEY
  • A LangSmith API key: LANGSMITH_API_KEY
  • A LangSmith project name configured in your environment
  • Access to your healthcare knowledge base:
    • policy docs
    • care pathways
    • clinical FAQs
    • internal triage guidance
  • Basic familiarity with:
    • LangGraph state graphs
    • retrievers/vector stores
    • LangChain runnables

Install the packages:

pip install langgraph langchain langchain-community langsmith langchain-openai faiss-cpu

Set environment variables:

export OPENAI_API_KEY="your-openai-key"
export LANGSMITH_API_KEY="your-langsmith-key"
export LANGSMITH_PROJECT="healthcare-rag-agent"
export LANGCHAIN_TRACING_V2="true"
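Before wiring anything else up, it can save time to confirm your process actually sees these variables. A minimal stdlib sketch (the variable names match the exports above; `tracing_configured` is a hypothetical helper, not part of LangSmith):

```python
import os

# Keys the rest of this guide assumes are present.
REQUIRED_VARS = ["OPENAI_API_KEY", "LANGSMITH_API_KEY", "LANGSMITH_PROJECT"]

def tracing_configured() -> bool:
    """True when tracing is switched on and every required key is non-empty."""
    if os.environ.get("LANGCHAIN_TRACING_V2", "").lower() != "true":
        return False
    return all(os.environ.get(name) for name in REQUIRED_VARS)

missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    print("Missing environment variables:", ", ".join(missing))
```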

Integration Steps

  1. Create a healthcare RAG retriever and instrument it for tracing

    Start with a vector-store-backed retriever. In healthcare, keep the corpus scoped to approved documents only. Once tracing is enabled through the environment variables, LangSmith traces every retrieval call automatically.

from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

docs = [
    Document(page_content="Adult asthma escalation pathway: use rescue inhaler first, then assess severity."),
    Document(page_content="Chest pain red flags: refer to emergency services if symptoms are acute or severe."),
]

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

  2. Define a LangGraph state model for RAG

    Use a typed state so the graph can carry the user query, retrieved context, and final answer. For healthcare workflows, this keeps the agent deterministic and easier to audit.

from typing import TypedDict, List
from langchain_core.documents import Document

class HealthcareRAGState(TypedDict):
    question: str
    context: List[Document]
    answer: str

  3. Build graph nodes for retrieval and generation

    This is where LangGraph handles orchestration. The retrieval node fetches relevant clinical snippets, and the generation node uses them to produce an answer grounded in source text.

from langgraph.graph import StateGraph, START, END

def retrieve(state: HealthcareRAGState):
    docs = retriever.invoke(state["question"])
    return {"context": docs}

def generate(state: HealthcareRAGState):
    context_text = "\n\n".join(doc.page_content for doc in state["context"])
    prompt = f"""
You are a healthcare assistant.
Answer only from the provided context.
If the answer is not present, say you don't have enough information.

Question: {state['question']}

Context:
{context_text}
"""
    response = llm.invoke(prompt)
    return {"answer": response.content}

graph = StateGraph(HealthcareRAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)

app = graph.compile()

  4. Connect the graph run to LangSmith tracing

    If LANGCHAIN_TRACING_V2 is set to true and LANGSMITH_API_KEY is present, LangChain and LangGraph runs appear in LangSmith automatically. For production systems, also tag runs with metadata such as tenant, workflow name, or document version.

result = app.invoke(
    {"question": "What should I do for adult asthma symptoms?"},
    config={
        "tags": ["healthcare", "rag", "triage"],
        "metadata": {
            "source": "internal_knowledge_base",
            "doc_version": "2026-04",
        },
        "run_name": "healthcare-rag-query",
    },
)

print(result["answer"])

  5. Add explicit LangSmith logging for evaluation or debugging

    When you need deeper control over experiments or offline evaluation, use the LangSmith client directly. This is useful when comparing prompt versions or tracking test cases across releases.

from langsmith import Client

client = Client()

dataset_name = "healthcare-rag-evals"
if not client.has_dataset(dataset_name=dataset_name):
    client.create_dataset(
        dataset_name=dataset_name,
        description="Test set for healthcare RAG answers",
    )

client.create_example(
    inputs={"question": "What are chest pain red flags?"},
    outputs={"expected": "Emergency referral if acute or severe."},
    dataset_name="healthcare-rag-evals",
)
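Once the dataset exists, a lightweight evaluator closes the loop. A sketch under the assumption that you score answers by whether the expected phrase appears; `contains_expected` is an illustrative evaluator, and the experiment call is shown only in outline:

```python
def contains_expected(outputs: dict, reference_outputs: dict) -> dict:
    """Score 1.0 when the expected phrase appears in the answer, else 0.0."""
    answer = (outputs.get("answer") or "").lower()
    expected = (reference_outputs.get("expected") or "").lower()
    return {
        "key": "contains_expected",
        "score": 1.0 if expected and expected in answer else 0.0,
    }

# Running it as a LangSmith experiment would look roughly like:
#
#   from langsmith import evaluate
#   evaluate(
#       lambda inputs: app.invoke({"question": inputs["question"]}),
#       data="healthcare-rag-evals",
#       evaluators=[contains_expected],
#   )
```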

Testing the Integration

Run one query through the graph and confirm it returns an answer grounded in your documents. Then open LangSmith and inspect the trace tree; you should see retrieval and generation steps as separate spans.

test_input = {"question": "When should chest pain be escalated?"}
output = app.invoke(
    test_input,
    config={
        "tags": ["smoke-test"],
        "metadata": {"env": "local"},
        "run_name": "integration-smoke-test",
    },
)

print("ANSWER:", output["answer"])

Expected output (exact wording will vary by model):

ANSWER: Chest pain should be escalated to emergency services if symptoms are acute or severe.

If tracing is working, LangSmith will show:

  • one root run for integration-smoke-test
  • one child run for retrieval
  • one child run for generation
  • tags and metadata attached to the run
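You can also verify this programmatically. A sketch using the LangSmith client's `list_runs` (the project name matches the environment setup above; the import is deferred so the snippet loads even without credentials):

```python
import os

def recent_run_names(project: str, limit: int = 5) -> list:
    """Return the names of the most recent runs in a LangSmith project."""
    from langsmith import Client  # deferred import: only needed when actually fetching
    client = Client()
    return [run.name for run in client.list_runs(project_name=project, limit=limit)]

# Only attempt the network call when credentials are configured.
if os.environ.get("LANGSMITH_API_KEY"):
    print(recent_run_names("healthcare-rag-agent"))
```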

Real-World Use Cases

  • Clinical policy assistant

    • Route questions through a healthcare workflow that retrieves approved policy documents before answering.
    • Use LangSmith traces to audit which policy chunks were used in each response.
  • Patient support triage bot

    • Build a graph that classifies urgency, retrieves symptom guidance, and generates safe next steps.
    • Track failure cases in LangSmith and compare prompt versions against incident tickets.
  • Internal care-team knowledge search

    • Let staff ask questions over discharge instructions, care pathways, or benefits documentation.
    • Use LangSmith datasets to benchmark answer quality before shipping changes to production.
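For the triage bot in particular, LangGraph's conditional edges are the natural branch point, and the router itself can be a plain function over the state. A hedged sketch (the keyword list and node names are illustrative only, not clinical guidance):

```python
# Phrases that should short-circuit straight to an escalation node.
EMERGENCY_TERMS = ("chest pain", "difficulty breathing", "unconscious", "severe bleeding")

def route_urgency(state: dict) -> str:
    """Return the name of the next graph node based on simple keyword matching."""
    question = state.get("question", "").lower()
    if any(term in question for term in EMERGENCY_TERMS):
        return "escalate"
    return "retrieve"

# Wired into the graph from the integration steps, this would look roughly like:
#
#   graph.add_node("escalate", escalate_node)  # escalate_node is hypothetical
#   graph.add_conditional_edges(
#       START, route_urgency, {"escalate": "escalate", "retrieve": "retrieve"}
#   )
```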

By Cyprian Aarons, AI Consultant at Topiax.