How to Integrate Haystack for healthcare with Elasticsearch for multi-agent systems

By Cyprian Aarons · Updated 2026-04-21

When you combine Haystack for healthcare with Elasticsearch, you get a practical stack for medical AI agents that need both structured retrieval and fast semantic search. That matters when one agent is triaging patient questions, another is pulling clinical guidelines, and a third is searching policy or lab-result history across large document sets.

This setup is useful for multi-agent systems because Elasticsearch gives you durable indexing, filtering, and retrieval at scale, while Haystack for healthcare handles the orchestration layer around clinical knowledge workflows. The result is a system that can route patient context, retrieve relevant records, and feed grounded answers to downstream agents.

Prerequisites

Before wiring this up, make sure you have:

•Python 3.10+
•An Elasticsearch cluster running locally or in Elastic Cloud
•A Haystack environment installed with the healthcare components you need
•API credentials for Elasticsearch if you are not using local dev mode
•Access to your healthcare documents in a supported format:
  - PDFs
  - text notes
  - clinical summaries
  - FHIR-exported content if your pipeline supports it
•A vector embedding model available for semantic retrieval

Install the core packages:

pip install haystack-ai elasticsearch sentence-transformers

If your Haystack healthcare package is distributed separately in your environment, install that too according to your internal package source.

Integration Steps

1) Connect to Elasticsearch

Start by creating an Elasticsearch client. For local development, this is usually enough.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

print(es.info())

For production, pass credentials and TLS settings explicitly.

from elasticsearch import Elasticsearch

es = Elasticsearch(
    "https://your-cluster.example.com:9243",
    basic_auth=("elastic", "your-password"),
    verify_certs=True,
)

print(es.ping())

You want this connection working before touching Haystack. If the cluster is unhealthy here, the rest of the pipeline will fail later in less obvious ways.

2) Build a Haystack document pipeline for healthcare content

Use Haystack to load and prepare medical documents before indexing them. In practice, this means reading source files, cleaning them, and converting them into Document objects.

from haystack import Document

documents = [
    Document(
        content="Patient education note: hypertension management includes sodium restriction and medication adherence.",
        meta={"patient_id": "p001", "doc_type": "education_note"}
    ),
    Document(
        content="Clinical guideline: first-line treatment for uncomplicated hypertension often starts with lifestyle changes.",
        meta={"source": "guideline", "specialty": "cardiology"}
    ),
]

If you are using Haystack pipelines for ingestion, keep the transformation step separate from indexing. That makes it easier to add PHI redaction or specialty-specific routing later.
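
As a rough sketch of what a separate transformation stage could look like, the example below masks two kinds of identifiers before anything is indexed. Everything here is an assumption for illustration: the Note dataclass is a stand-in for a Haystack Document, and the two regular expressions are far too simple for real PHI detection, which needs a vetted tool and human review.

```python
import re
from dataclasses import dataclass, field, replace

# Illustrative patterns only; real PHI detection needs a vetted tool and review.
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
MRN_RE = re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)


@dataclass(frozen=True)
class Note:
    """Stand-in for a Haystack Document: text content plus metadata."""
    content: str
    meta: dict = field(default_factory=dict)


def redact_phi(note: Note) -> Note:
    """Return a copy of the note with phone numbers and MRNs masked."""
    text = PHONE_RE.sub("[PHONE]", note.content)
    text = MRN_RE.sub("[MRN]", text)
    return replace(note, content=text)


def transform(notes: list[Note]) -> list[Note]:
    """Transformation stage: runs before, and independently of, indexing."""
    return [redact_phi(n) for n in notes]


raw = [Note("Follow-up for MRN: 884213, call 555-201-7788 to reschedule.", {"doc_type": "note"})]
clean = transform(raw)
print(clean[0].content)
```

Because transform() returns new objects and never touches the index, you can swap in a stronger redaction step later without changing the indexing code.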

3) Index documents into Elasticsearch with embeddings

For multi-agent search, store both text and vectors. Haystack can generate embeddings through the embedder components in your pipeline; Elasticsearch then stores the vectors alongside the text so agents can combine keyword and vector retrieval.

Here’s a direct example using the Elasticsearch Python client to create an index and store embedded documents:

from sentence_transformers import SentenceTransformer
from elasticsearch import Elasticsearch

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
es = Elasticsearch("http://localhost:9200")

index_name = "healthcare_docs"

if not es.indices.exists(index=index_name):
    es.indices.create(
        index=index_name,
        mappings={
            "properties": {
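                "content": {"type": "text"},
                "patient_id": {"type": "keyword"},
                "doc_type": {"type": "keyword"},
                "source": {"type": "keyword"},
                "specialty": {"type": "keyword"},
                # index + similarity make the field searchable with kNN;
                # 384 matches the output size of all-MiniLM-L6-v2
                "embedding": {
                    "type": "dense_vector",
                    "dims": 384,
                    "index": True,
                    "similarity": "cosine",
                },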
            }
        }
    )

for doc in documents:
    embedding = model.encode(doc.content).tolist()
    es.index(
        index=index_name,
        document={
            "content": doc.content,
            **doc.meta,
            "embedding": embedding,
        },
    )

es.indices.refresh(index=index_name)

For production systems, use bulk indexing instead of single-document writes. That reduces overhead when ingesting thousands of notes or claims records.
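
A minimal sketch of that bulk path, assuming each record is a dict with id, content, meta, and embedding keys (that record shape is hypothetical). The action builder is plain Python and runs offline; bulk_index passes the actions to helpers.bulk from the elasticsearch package and needs a live cluster.

```python
from typing import Iterable, Iterator


def build_bulk_actions(index_name: str, records: Iterable[dict]) -> Iterator[dict]:
    """Yield one bulk action per record in the shape elasticsearch.helpers.bulk accepts.

    Each record is assumed to be a dict with "id", "content", "meta",
    and "embedding" keys; that record shape is hypothetical.
    """
    for record in records:
        yield {
            "_index": index_name,
            "_id": record["id"],  # explicit IDs make re-runs overwrite, not duplicate
            "_source": {
                "content": record["content"],
                **record["meta"],
                "embedding": record["embedding"],
            },
        }


def bulk_index(es, index_name: str, records: Iterable[dict]):
    """Send all actions in batches; requires the elasticsearch package and a live cluster."""
    from elasticsearch import helpers  # imported lazily so the builder works offline

    return helpers.bulk(es, build_bulk_actions(index_name, records))


sample = [
    {"id": "n1", "content": "note one", "meta": {"doc_type": "note"}, "embedding": [0.1, 0.2]},
    {"id": "n2", "content": "note two", "meta": {"doc_type": "note"}, "embedding": [0.3, 0.4]},
]
for action in build_bulk_actions("healthcare_docs", sample):
    print(action["_id"], sorted(action["_source"]))
```

Using a generator keeps memory use flat on large ingests, since the helper consumes actions in chunks and never needs the full list.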

4) Query Elasticsearch from a Haystack-powered agent workflow

Now wire retrieval into your agent layer. In a multi-agent setup, one agent can call Elasticsearch directly while another uses Haystack orchestration to decide which query to run based on intent.

A simple retrieval function looks like this:

from sentence_transformers import SentenceTransformer
from elasticsearch import Elasticsearch

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
es = Elasticsearch("http://localhost:9200")

def search_healthcare_docs(query: str, top_k: int = 3):
    query_vector = model.encode(query).tolist()

    response = es.search(
        index="healthcare_docs",
        knn={
            "field": "embedding",
            "query_vector": query_vector,
            "k": top_k,
            "num_candidates": 50,
        },
        _source=["content", "patient_id", "doc_type", "source", "specialty"],
    )
    return response["hits"]["hits"]

results = search_healthcare_docs("What are first-line treatments for hypertension?")
for hit in results:
    print(hit["_source"]["content"])
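
In a healthcare setting you will often want to restrict retrieval to a single patient's records. The sketch below shows one possible way to build the knn argument with an optional term filter on patient_id. The helper name build_knn_clause is an assumption for illustration; the filter key inside knn is part of the Elasticsearch kNN search option, and the field names match the index mapping created in step 3.

```python
from typing import Optional


def build_knn_clause(
    query_vector: list[float],
    top_k: int = 3,
    num_candidates: int = 50,
    patient_id: Optional[str] = None,
) -> dict:
    """Build the `knn` argument for es.search(), optionally scoped to one patient."""
    if num_candidates < top_k:
        raise ValueError("num_candidates must be >= top_k")
    clause = {
        "field": "embedding",
        "query_vector": query_vector,
        "k": top_k,
        "num_candidates": num_candidates,
    }
    if patient_id is not None:
        # The filter is applied during the vector search, so k results
        # still come back when enough matching documents exist.
        clause["filter"] = {"term": {"patient_id": patient_id}}
    return clause


clause = build_knn_clause([0.0] * 384, top_k=3, patient_id="p001")
print(clause["filter"])
# Usage with the client from this article: es.search(index="healthcare_docs", knn=clause)
```

Inside search_healthcare_docs you could pass this clause as the knn argument in place of the inline dictionary.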

If you are using Haystack Agents or pipelines around this search function, treat it as a tool call. One agent can retrieve context; another can summarize it into a patient-safe answer; a third can check policy constraints before output.

5) Add a coordination layer for multi-agent use

The key integration pattern is not just “Haystack talks to Elasticsearch.” It’s “agents share one retrieval substrate.” Keep the search tool stateless so multiple agents can call it without stepping on each other.
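
To illustrate what "stateless" buys you, here is a small sketch in which two agents call the same tool from separate threads. The names make_search_tool and fake_search are hypothetical; fake_search stands in for search_healthcare_docs so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def make_search_tool(search_fn):
    """Wrap a search function as a stateless tool: inputs in, results out.

    Nothing is cached or mutated between calls, so agents can share it.
    """
    def tool(agent_name: str, query: str) -> dict:
        return {"agent": agent_name, "query": query, "hits": search_fn(query)}
    return tool


def fake_search(query: str) -> list[str]:
    """Stand-in for search_healthcare_docs so the sketch runs without a cluster."""
    corpus = ["hypertension guideline", "diabetes care plan", "hypertension education note"]
    return [doc for doc in corpus if query.lower() in doc]


search_tool = make_search_tool(fake_search)
requests = [("triage", "hypertension"), ("policy", "diabetes")]

# Each agent calls the same tool from its own thread.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda args: search_tool(*args), requests))

for result in results:
    print(result["agent"], result["hits"])
```

Since every call carries its own inputs and returns its own result, no locking is needed around the tool itself.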

A simple coordinator might look like this:

def triage_agent(query: str):
    docs = search_healthcare_docs(query)
    return {
        "agent": "triage",
        "retrieved_docs": [hit["_source"] for hit in docs],
    }

def summarizer_agent(retrieval_result):
    snippets = [doc["content"] for doc in retrieval_result["retrieved_docs"]]
    return {
        "agent": "summarizer",
        "summary": "\n".join(snippets[:2])
    }

state = triage_agent("hypertension follow-up guidance")
final_output = summarizer_agent(state)
print(final_output["summary"])

That pattern scales well because each agent has one job:

•retrieve
•summarize
•validate
•escalate
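
The last two roles can be sketched in the same dictionary-passing style as the agents above. This is an assumption-heavy illustration: the blocked-phrase list, the function names, and the "human_review" action are all hypothetical, and a real validator would rely on clinical policy rather than substring checks.

```python
BLOCKED_PHRASES = ("stop taking", "double your dose")  # hypothetical policy list


def validator_agent(summary_result: dict) -> dict:
    """Flag summaries that are empty or contain phrases the policy disallows."""
    text = summary_result.get("summary", "")
    problems = [p for p in BLOCKED_PHRASES if p in text.lower()]
    if not text.strip():
        problems.append("empty summary")
    return {"agent": "validator", "summary": text, "approved": not problems, "problems": problems}


def escalation_agent(validation_result: dict) -> dict:
    """Pass approved summaries through; route everything else to a human reviewer."""
    if validation_result["approved"]:
        return {"agent": "escalation", "action": "send", "summary": validation_result["summary"]}
    return {"agent": "escalation", "action": "human_review", "reasons": validation_result["problems"]}


ok = escalation_agent(validator_agent({"summary": "Lifestyle changes are often first-line."}))
flagged = escalation_agent(validator_agent({"summary": "You should stop taking your medication."}))
print(ok["action"], flagged["action"])
```

Each function takes the previous agent's output dictionary, so the chain can be extended or reordered without changing the retrieval layer.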

Testing the Integration

Run a basic end-to-end test by querying for a document you already indexed in step 3 and checking that it comes back.

test_query = "hypertension management"
hits = search_healthcare_docs(test_query)

assert len(hits) > 0
assert any("hypertension" in hit["_source"]["content"].lower() for hit in hits)

print("Integration test passed")
for hit in hits:
    print(hit["_source"]["content"])

Expected output (the order of the two documents depends on their similarity scores and may differ):

Integration test passed
Clinical guideline: first-line treatment for uncomplicated hypertension often starts with lifestyle changes.
Patient education note: hypertension management includes sodium restriction and medication adherence.

If you get zero hits, check these first:

•index name matches exactly
•embeddings were stored correctly
•vector dimensions match the model output
•your Elasticsearch version supports the knn search option used above (it is an 8.x feature)
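
The dimension check in particular is easy to automate. The sketch below compares a query vector against the dims value in an index mapping; the helper names are hypothetical, and the sample dictionary mirrors the shape of a get-mapping response for the healthcare_docs index so the example runs without a cluster.

```python
def mapped_vector_dims(mapping_response: dict, index_name: str, field: str = "embedding"):
    """Read the dense_vector `dims` value from a get-mapping response, or None if absent."""
    properties = mapping_response.get(index_name, {}).get("mappings", {}).get("properties", {})
    return properties.get(field, {}).get("dims")


def check_vector(mapping_response: dict, index_name: str, vector: list[float]) -> str:
    """Return a short diagnostic comparing a query vector against the index mapping."""
    dims = mapped_vector_dims(mapping_response, index_name)
    if dims is None:
        return f"no dense_vector field 'embedding' found in index '{index_name}'"
    if dims != len(vector):
        return f"dimension mismatch: mapping has {dims}, vector has {len(vector)}"
    return "ok"


# Shape mirrors what es.indices.get_mapping(index="healthcare_docs") returns.
sample_mapping = {
    "healthcare_docs": {
        "mappings": {"properties": {"embedding": {"type": "dense_vector", "dims": 384}}}
    }
}
print(check_vector(sample_mapping, "healthcare_docs", [0.0] * 384))
print(check_vector(sample_mapping, "healthcare_docs", [0.0] * 768))
print(check_vector(sample_mapping, "health_docs", [0.0] * 384))
```

Against a live cluster, pass the body of es.indices.get_mapping(index="healthcare_docs") as mapping_response and model.encode(query).tolist() as the vector.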

Real-World Use Cases

•Clinical knowledge assistant
  - One agent retrieves guideline snippets from Elasticsearch.
  - Another agent uses Haystack to synthesize an answer with citations.
•Patient support workflow
  - A triage agent searches prior notes or care plans.
  - A compliance agent checks that the response stays within approved language.
•Care coordination across systems
  - Agents query discharge summaries, referral notes, and lab interpretations.
  - The shared retrieval layer keeps responses grounded across teams and specialties.

This integration works best when you treat Elasticsearch as the durable memory layer and Haystack as the orchestration layer around it. That gives multi-agent healthcare systems a clean separation between storage, retrieval, and reasoning.


By Cyprian Aarons, AI Consultant at Topiax.
