How to Integrate Haystack for insurance with Elasticsearch for multi-agent systems
Combining Haystack for insurance with Elasticsearch gives you a practical retrieval layer for multi-agent systems: one agent can search claims, policies, and underwriting notes while another agent reasons over the retrieved evidence. That setup is useful when you need traceable answers, fast lookup across large document sets, and a clean way to share context between agents.
Prerequisites
- Python 3.10+
- An Elasticsearch cluster running locally or in your cloud environment
- Access to your Haystack for insurance package and API credentials if your setup requires them
- pip installed
- A corpus of insurance documents to index:
  - policy PDFs
  - claims notes
  - underwriting guidelines
  - customer correspondence
- Environment variables configured:
  - ELASTICSEARCH_URL
  - ELASTICSEARCH_API_KEY, if applicable
Install the core dependencies:
pip install haystack-ai elasticsearch-haystack elasticsearch sentence-transformers
Integration Steps
- Connect to Elasticsearch and create a document store
Start by initializing the Elasticsearch client and the Haystack document store. In a production setup, keep index names explicit so multiple agents don’t collide on the same data.
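import os

from elasticsearch import Elasticsearch
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

ES_URL = os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200")

# Raw client, useful for health checks and index administration
es_client = Elasticsearch(
    hosts=[ES_URL],
    api_key=os.environ.get("ELASTICSEARCH_API_KEY"),  # None if your cluster needs no auth
)

# Haystack document store backed by an explicitly named index.
# Note: this store does not take an embedding_dim argument; extra keyword
# arguments are forwarded to the Elasticsearch client.
document_store = ElasticsearchDocumentStore(
    hosts=[ES_URL],
    index="insurance_docs",
)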
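If several agents or environments share one cluster, it can help to collect the connection settings in one place. The following is a minimal sketch, not part of Haystack: `elasticsearch_settings` is a hypothetical helper that reads the two environment variables from the prerequisites and falls back to a local cluster.

```python
import os


def elasticsearch_settings(env=None, index="insurance_docs"):
    """Collect connection settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    settings = {
        # Fall back to a local cluster when ELASTICSEARCH_URL is not set
        "hosts": [env.get("ELASTICSEARCH_URL", "http://localhost:9200")],
        # Explicit index name so agents do not collide on the same data
        "index": index,
    }
    api_key = env.get("ELASTICSEARCH_API_KEY")
    if api_key:
        # Only include the key when one is actually configured
        settings["api_key"] = api_key
    return settings


if __name__ == "__main__":
    print(elasticsearch_settings(index="claims_agent_docs"))
```

The `hosts` and `index` values map directly onto the document store arguments used above. How the API key is passed to the document store differs between integration versions, so check that against the version you have installed.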
- Load insurance documents into Haystack
Use Haystack’s Document objects so downstream components can preserve metadata like claim ID, policy number, and document type. For insurance systems, that metadata is what makes retrieval useful to agents.
from haystack import Document
from haystack.document_stores.types import DuplicatePolicy

docs = [
    Document(
        content="Policy P-1001 covers water damage up to $25,000 with a $500 deductible.",
        meta={"policy_id": "P-1001", "doc_type": "policy"},
    ),
    Document(
        content="Claim C-7782 was denied because the loss was caused by excluded flood damage.",
        meta={"claim_id": "C-7782", "doc_type": "claim_note"},
    ),
]

# OVERWRITE keeps the script re-runnable; the default policy raises on duplicate IDs
document_store.write_documents(docs, policy=DuplicatePolicy.OVERWRITE)
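Metadata pays off once agents restrict their searches with it. Here is a small sketch of building a metadata filter in the dictionary format used by Haystack 2.x. `meta_filter` is a hypothetical helper, and the field names are the ones from the sample documents above.

```python
def meta_filter(**criteria):
    """Build a filter dict that matches documents on ALL given metadata values."""
    if not criteria:
        raise ValueError("at least one metadata criterion is required")
    conditions = [
        # Metadata fields are addressed with a "meta." prefix
        {"field": f"meta.{key}", "operator": "==", "value": value}
        for key, value in criteria.items()
    ]
    return {"operator": "AND", "conditions": conditions}


if __name__ == "__main__":
    print(meta_filter(doc_type="policy", policy_id="P-1001"))
```

A dict like this can be passed as `filters=` to `document_store.filter_documents(...)` or to a retriever's `run(...)`, so a claims agent sees only claim notes and a coverage agent sees only policy text.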
- Create an embedding pipeline and index the documents
For semantic search, generate embeddings before writing or updating your index. In Haystack pipelines, this is usually handled with an embedder plus a writer connected to Elasticsearch. Because the documents from the previous step are already in the index, the writer below overwrites them with their embedded versions instead of failing on duplicate IDs.
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.writers import DocumentWriter
from haystack.document_stores.types import DuplicatePolicy

# all-MiniLM-L6-v2 produces 384-dimensional embeddings
embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
writer = DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE)

indexing_pipeline = Pipeline()
indexing_pipeline.add_component("embedder", embedder)
indexing_pipeline.add_component("writer", writer)
# The embedder's output documents feed the writer's input
indexing_pipeline.connect("embedder.documents", "writer.documents")

result = indexing_pipeline.run({"embedder": {"documents": docs}})
print(result)
- Add a retriever for multi-agent query handling
In a multi-agent system, one agent typically handles retrieval while another handles reasoning or task execution. Use an Elasticsearch-backed retriever so agents can fetch evidence from the same shared index.
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchBM25Retriever

# Lexical (BM25) search against the shared Elasticsearch index.
# The in-memory retriever only works with InMemoryDocumentStore, so use the Elasticsearch one here.
bm25_retriever = ElasticsearchBM25Retriever(document_store=document_store)

query = "Which policy covers water damage?"
retrieved = bm25_retriever.run(query=query, top_k=3)

for doc in retrieved["documents"]:
    print(doc.content, doc.meta)
If you want semantic retrieval from Elasticsearch-backed vectors, use the retriever that matches your Haystack version and integration package. The pattern stays the same: query → retrieve → pass context to the next agent.
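As a rough sketch of the semantic variant, the functions below assemble a query pipeline from a text embedder and the embedding retriever shipped with the Elasticsearch integration. It assumes the `elasticsearch-haystack` package is installed and that documents were indexed with the same embedding model. `build_semantic_search` and `semantic_search` are hypothetical names, and the class names should be checked against your installed versions.

```python
def build_semantic_search(document_store, model="sentence-transformers/all-MiniLM-L6-v2"):
    """Assemble a query pipeline: embed the question, then search stored vectors."""
    # Imports are deferred so this module still loads when the integration is absent
    from haystack import Pipeline
    from haystack.components.embedders import SentenceTransformersTextEmbedder
    from haystack_integrations.components.retrievers.elasticsearch import (
        ElasticsearchEmbeddingRetriever,
    )

    pipeline = Pipeline()
    # The query must be embedded with the same model used at indexing time
    pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder(model=model))
    pipeline.add_component(
        "retriever", ElasticsearchEmbeddingRetriever(document_store=document_store)
    )
    pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    return pipeline


def semantic_search(pipeline, question, top_k=3):
    """Run the query pipeline and return the retrieved documents."""
    result = pipeline.run(
        {"text_embedder": {"text": question}, "retriever": {"top_k": top_k}}
    )
    return result["retriever"]["documents"]
```

Typical usage would be `pipeline = build_semantic_search(document_store)` followed by `semantic_search(pipeline, "Which policy covers water damage?")`.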
- Wire retrieval into an agent workflow
The cleanest pattern is to make retrieval a dedicated tool agent. That agent queries Elasticsearch through Haystack and returns structured context to the decision-making agent.
def retrieve_insurance_context(question: str) -> list[dict]:
    """Return retrieved documents as plain dicts that another agent can consume."""
    result = bm25_retriever.run(query=question, top_k=5)
    return [
        {
            "content": doc.content,
            "meta": doc.meta,
        }
        for doc in result["documents"]
    ]

context = retrieve_insurance_context("Does policy P-1001 cover water damage?")
print(context)
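To make the handoff between the two agents concrete, here is a framework-free sketch. It assumes the retrieval agent is any callable returning the list-of-dicts shape shown above, and that the reasoning agent is any callable taking a prompt string. `format_evidence`, `answer_with_evidence`, and the `reason` callable are hypothetical names, not part of Haystack.

```python
def format_evidence(context):
    """Number each retrieved passage and append its metadata for traceability."""
    lines = []
    for i, item in enumerate(context, start=1):
        source = ", ".join(f"{k}={v}" for k, v in sorted(item["meta"].items()))
        lines.append(f"[{i}] {item['content']} ({source})")
    return "\n".join(lines)


def answer_with_evidence(question, retrieve, reason):
    """Retrieval agent fetches context; the reasoning agent sees only that context."""
    context = retrieve(question)
    if not context:
        # No evidence found: return nothing rather than let the reasoner guess
        return {"answer": None, "sources": []}
    prompt = f"Question: {question}\nEvidence:\n{format_evidence(context)}"
    return {
        "answer": reason(prompt),
        "sources": [item["meta"] for item in context],
    }


if __name__ == "__main__":
    # Stand-ins for the real retrieval and reasoning agents
    fake_retrieve = lambda q: [{"content": "Policy P-1001 covers water damage.",
                                "meta": {"policy_id": "P-1001"}}]
    fake_reason = lambda prompt: "Yes, per [1]."
    print(answer_with_evidence("Is water damage covered?", fake_retrieve, fake_reason))
```

In a real system, `retrieve` could be the `retrieve_insurance_context` function and `reason` a call into whichever LLM client you use. Returning the source metadata alongside the answer is what keeps the result traceable.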
Testing the Integration
Run a simple end-to-end check: write documents, query them, and inspect the returned metadata.
test_query = "What does policy P-1001 cover?"
results = bm25_retriever.run(query=test_query, top_k=2)

assert len(results["documents"]) > 0

for doc in results["documents"]:
    print("CONTENT:", doc.content)
    print("META:", doc.meta)
Expected output:
CONTENT: Policy P-1001 covers water damage up to $25,000 with a $500 deductible.
META: {'policy_id': 'P-1001', 'doc_type': 'policy'}
If that comes back cleanly, your Haystack-to-Elasticsearch path is working and ready for agent orchestration.
Real-World Use Cases
- Claims triage agent
  - Pulls claim notes and policy language from Elasticsearch through Haystack.
  - Flags likely denial reasons or missing documentation before a human adjuster reviews it.
- Underwriting copilot
  - Searches prior submissions, risk notes, and guideline documents.
  - Gives underwriters grounded answers with source metadata attached.
- Customer service escalation flow
  - One agent retrieves coverage evidence.
  - Another agent drafts a response using only approved policy text and claim history.
The main pattern here is simple: Elasticsearch stores and indexes the corpus, while Haystack structures retrieval and feeds the results to your agents. That division keeps your multi-agent system maintainable as document volume grows and when auditability matters.
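For the auditability point, one lightweight option is to log which sources each agent saw for each question. This is only a sketch under assumed conventions: `audit_record` is a hypothetical helper, and the record layout is illustrative rather than any standard.

```python
import json
from datetime import datetime, timezone


def audit_record(agent, question, context):
    """Serialize one retrieval event as a JSON line for an audit log."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "question": question,
        # Keep only metadata: it identifies the sources without copying their text
        "sources": [item["meta"] for item in context],
    }
    return json.dumps(entry, sort_keys=True)


if __name__ == "__main__":
    sample = [{"content": "Policy text.", "meta": {"policy_id": "P-1001"}}]
    print(audit_record("claims_triage", "Is water damage covered?", sample))
```

Appending these lines to a file or a dedicated index gives reviewers a record of what evidence backed each answer.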
By Cyprian Aarons, AI Consultant at Topiax.