How to Integrate Haystack for insurance with Elasticsearch for AI agents
Combining Haystack for insurance with Elasticsearch gives you a practical retrieval layer for AI agents that need policy-aware answers, claim lookup, and document search over large insurance corpora. Haystack handles the agent orchestration and retrieval pipeline, while Elasticsearch gives you fast full-text search, filtering, and scoring across policy docs, claims notes, endorsements, and underwriting files.
Prerequisites
- •Python 3.10+
- •An Elasticsearch cluster running locally or in the cloud
- •A Haystack for insurance project installed in your environment
- •API credentials or access tokens if your Haystack deployment requires them
- •Insurance documents ready to index:
  - •policy wordings
  - •claims summaries
  - •underwriting guidelines
  - •broker correspondence
- •Python packages:
  - •haystack-ai (the PyPI package for Haystack 2.x, imported as haystack)
  - •elasticsearch
  - •any Haystack integration package your insurance stack uses for document ingestion and agents
Integration Steps
- •Install the dependencies and verify both clients are available.
pip install haystack-ai elasticsearch
from elasticsearch import Elasticsearch
from haystack import Document
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
If your insurance setup exposes a specific Haystack package or pipeline wrapper, install that as well and keep the same pattern: Elasticsearch stores the corpus, Haystack builds the agent workflow on top.
- •Connect to Elasticsearch and create an index for insurance content.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
index_name = "insurance-docs"

if not es.indices.exists(index=index_name):
    es.indices.create(
        index=index_name,
        mappings={
            "properties": {
                "content": {"type": "text"},
                "doc_type": {"type": "keyword"},
                "policy_id": {"type": "keyword"},
                "claim_id": {"type": "keyword"},
            }
        },
    )

print(es.info())
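Map each insurance metadata field explicitly, using keyword for identifiers that need exact-match filtering. In production, you will filter by policy_id, claim_id, jurisdiction, product line, or effective date, so add every field you plan to filter on to the mapping before indexing.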
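To make the filtering idea concrete, here is a minimal sketch of a query builder that combines a full-text match on content with exact-match filters on the keyword fields from the mapping above. The function name build_filtered_query and its parameters are hypothetical, not part of Haystack or the Elasticsearch client; only the bool, match, and term query shapes come from the Elasticsearch query DSL.

```python
from typing import Any, Optional


def build_filtered_query(
    text: str,
    policy_id: Optional[str] = None,
    claim_id: Optional[str] = None,
    doc_type: Optional[str] = None,
) -> dict[str, Any]:
    """Build a bool query: full-text match on content, exact filters on keyword fields."""
    filters = []
    for field, value in (
        ("policy_id", policy_id),
        ("claim_id", claim_id),
        ("doc_type", doc_type),
    ):
        # Only add a term filter for fields the caller actually supplied.
        if value is not None:
            filters.append({"term": {field: value}})
    return {
        "bool": {
            "must": [{"match": {"content": text}}],
            "filter": filters,
        }
    }


query = build_filtered_query("flood damage", policy_id="POL-1001", doc_type="policy")
# With a live cluster this would be passed as:
#   es.search(index="insurance-docs", query=query)
```

Filters placed under filter do not affect scoring, so the relevance ranking still comes from the text match while the metadata narrows the candidate set.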
- •Write documents into Elasticsearch using Haystack documents as the source format.
from elasticsearch import Elasticsearch
from haystack import Document

es = Elasticsearch("http://localhost:9200")
index_name = "insurance-docs"

docs = [
    Document(
        content="This policy excludes flood damage unless flood cover is endorsed.",
        meta={"doc_type": "policy", "policy_id": "POL-1001"}
    ),
    Document(
        content="Claim CLM-7782 was denied due to exclusion clause 4.2.",
        meta={"doc_type": "claim", "claim_id": "CLM-7782"}
    ),
]

for doc in docs:
    es.index(
        index=index_name,
        document={
            "content": doc.content,
            **doc.meta,
        },
    )

es.indices.refresh(index=index_name)
At this point, Elasticsearch is your source of truth for retrieval. Haystack can sit on top of it to orchestrate retrieval plus generation for agent responses.
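Indexing one document per request is fine for a demo, but a larger corpus is usually sent in batches. The sketch below prepares bulk actions with a content-derived _id so that re-running the load does not create duplicates. The helper name to_bulk_actions is hypothetical, and hashing the content as the ID is an assumption for illustration; a real system might use a policy or claim reference instead.

```python
import hashlib
from typing import Any, Iterable, Iterator


def to_bulk_actions(
    records: Iterable[dict[str, Any]], index_name: str
) -> Iterator[dict[str, Any]]:
    """Yield one bulk action per record, with an _id derived from the content."""
    for record in records:
        # A content hash as _id means re-indexing the same text overwrites
        # the existing document instead of adding a duplicate.
        doc_id = hashlib.sha256(record["content"].encode("utf-8")).hexdigest()
        yield {"_index": index_name, "_id": doc_id, "_source": record}


records = [
    {
        "content": "This policy excludes flood damage unless flood cover is endorsed.",
        "doc_type": "policy",
        "policy_id": "POL-1001",
    },
    {
        "content": "Claim CLM-7782 was denied due to exclusion clause 4.2.",
        "doc_type": "claim",
        "claim_id": "CLM-7782",
    },
]
actions = list(to_bulk_actions(records, "insurance-docs"))
# With a live cluster:
#   from elasticsearch import helpers
#   helpers.bulk(es, actions)
```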
- •Build a retrieval pipeline in Haystack that queries Elasticsearch.
from elasticsearch import Elasticsearch
from haystack import Document

es = Elasticsearch("http://localhost:9200")
index_name = "insurance-docs"

def search_elasticsearch(query: str, size: int = 5):
    response = es.search(
        index=index_name,
        query={
            "multi_match": {
                "query": query,
                "fields": ["content", "doc_type", "policy_id", "claim_id"]
            }
        },
        size=size,
    )
    # Convert each hit into a Haystack Document, keeping metadata and score.
    return [
        Document(
            content=hit["_source"]["content"],
            meta={k: v for k, v in hit["_source"].items() if k != "content"},
            score=hit["_score"],
        )
        for hit in response["hits"]["hits"]
    ]

query = "flood exclusion"
results = search_elasticsearch(query)
for doc in results:
    print(doc.score, doc.meta, doc.content)
This is the core integration pattern. In a real agent system, wrap search_elasticsearch() inside a Haystack tool or custom component so the agent can retrieve relevant policy language before generating an answer.
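One way to approach that wrapping is a small class whose run() method returns a dictionary of named outputs, which is the contract Haystack 2.x components follow. The sketch below is deliberately free of Haystack imports so it runs anywhere: the class name InsuranceKeywordRetriever and the StubClient stand-in are hypothetical, and plain dictionaries take the place of Document objects. In a real pipeline you would add Haystack's @component decorator to the class, declare the outputs with @component.output_types, and build Document instances instead.

```python
from typing import Any


class InsuranceKeywordRetriever:
    """Hypothetical retriever following the run() -> dict contract of a Haystack component."""

    def __init__(self, client: Any, index_name: str, top_k: int = 5):
        # The client is injected so a real Elasticsearch client or a test
        # double can be used interchangeably.
        self.client = client
        self.index_name = index_name
        self.top_k = top_k

    def run(self, query: str) -> dict[str, list[dict[str, Any]]]:
        response = self.client.search(
            index=self.index_name,
            query={"match": {"content": query}},
            size=self.top_k,
        )
        documents = [
            {
                "content": hit["_source"]["content"],
                "meta": {k: v for k, v in hit["_source"].items() if k != "content"},
                "score": hit["_score"],
            }
            for hit in response["hits"]["hits"]
        ]
        # Components return a dict keyed by output name.
        return {"documents": documents}


class StubClient:
    """Stand-in for the Elasticsearch client so the sketch runs without a cluster."""

    def search(self, index: str, query: dict, size: int) -> dict:
        hit = {
            "_score": 1.0,
            "_source": {
                "content": "This policy excludes flood damage unless flood cover is endorsed.",
                "doc_type": "policy",
                "policy_id": "POL-1001",
            },
        }
        return {"hits": {"hits": [hit][:size]}}


retriever = InsuranceKeywordRetriever(StubClient(), "insurance-docs", top_k=3)
result = retriever.run("flood exclusion")
```

Injecting the client keeps the retrieval logic testable without a running cluster, which matters once the agent's answers depend on it.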
- •Attach retrieval to an AI agent flow and use the retrieved context to answer questions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
index_name = "insurance-docs"

def retrieve_context(question: str) -> str:
    response = es.search(
        index=index_name,
        query={
            "multi_match": {
                "query": question,
                "fields": ["content", "doc_type", "policy_id", "claim_id"]
            }
        },
        size=3,
    )
    # Prefix each chunk with its document type so the LLM can tell sources apart.
    chunks = []
    for hit in response["hits"]["hits"]:
        src = hit["_source"]
        chunks.append(f"[{src.get('doc_type')}] {src.get('content')}")
    return "\n".join(chunks)

question = "Does this policy cover flood damage?"
context = retrieve_context(question)

prompt = f"""
Use the context below to answer the insurance question.
Context:
{context}
Question:
{question}
"""
print(prompt)
In production, this prompt would be passed to your LLM node inside a Haystack pipeline or agent executor. The important part is that retrieval happens through Elasticsearch with insurance-specific metadata preserved end-to-end.
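To show what preserving metadata end-to-end can look like in the prompt itself, the sketch below labels each chunk with its policy or claim reference and stops adding chunks once a character budget is reached. The function name build_context and the max_chars parameter are hypothetical, and a character count is only a rough stand-in for a token budget.

```python
from typing import Any


def build_context(hits: list[dict[str, Any]], max_chars: int = 2000) -> str:
    """Format hits as labelled chunks, stopping before max_chars is exceeded."""
    chunks: list[str] = []
    used = 0
    for hit in hits:
        src = hit["_source"]
        # Prefer the policy reference, fall back to the claim reference.
        ref = src.get("policy_id") or src.get("claim_id") or "unknown"
        chunk = f"[{src.get('doc_type', 'unknown')} {ref}] {src.get('content', '')}"
        if used + len(chunk) > max_chars:
            break
        chunks.append(chunk)
        used += len(chunk) + 1  # +1 for the newline that joins chunks
    return "\n".join(chunks)


sample_hits = [
    {
        "_source": {
            "doc_type": "policy",
            "policy_id": "POL-1001",
            "content": "This policy excludes flood damage unless flood cover is endorsed.",
        }
    },
    {
        "_source": {
            "doc_type": "claim",
            "claim_id": "CLM-7782",
            "content": "Claim CLM-7782 was denied due to exclusion clause 4.2.",
        }
    },
]
context = build_context(sample_hits)
```

With references in the context, the agent can cite the policy or claim it relied on, which makes answers easier to audit.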
Testing the Integration
Run a simple query against a known indexed policy clause and check that the right document comes back.
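from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

response = es.search(
    index="insurance-docs",
    query={
        "match": {
            "content": {
                "query": "flood damage"
            }
        }
    },
    size=1,
)

# Fail with a clear message if nothing was indexed, rather than an IndexError.
hits = response["hits"]["hits"]
assert hits, "No documents matched; check that indexing and refresh succeeded."

hit = hits[0]["_source"]
print(hit["doc_type"])
print(hit["policy_id"])
print(hit["content"])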
Expected output:
policy
POL-1001
This policy excludes flood damage unless flood cover is endorsed.
If you get that result back consistently, your indexing and retrieval path is working.
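The same check can be turned into a reusable function so it can run in a test suite. This is a minimal sketch: check_top_hit and the FakeSearchClient stand-in are hypothetical names, and the fake exists only so the example runs without a cluster. Against a real deployment you would pass the Elasticsearch client instead.

```python
from typing import Any


def check_top_hit(
    client: Any, index_name: str, query_text: str, expected: dict[str, str]
) -> bool:
    """Return True if the top hit for query_text carries the expected field values."""
    response = client.search(
        index=index_name,
        query={"match": {"content": query_text}},
        size=1,
    )
    hits = response["hits"]["hits"]
    if not hits:
        return False  # nothing matched, or the index has not been refreshed yet
    source = hits[0]["_source"]
    return all(source.get(field) == value for field, value in expected.items())


class FakeSearchClient:
    """Stand-in client that returns a fixed list of hits."""

    def __init__(self, hits: list[dict[str, Any]]):
        self._hits = hits

    def search(self, index: str, query: dict, size: int) -> dict:
        return {"hits": {"hits": self._hits[:size]}}


fake = FakeSearchClient(
    [{"_source": {"doc_type": "policy", "policy_id": "POL-1001", "content": "..."}}]
)
ok = check_top_hit(
    fake, "insurance-docs", "flood damage", {"doc_type": "policy", "policy_id": "POL-1001"}
)
```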
Real-World Use Cases
- •Claims triage agents
  - •Retrieve prior claim notes, exclusions, and settlement history before drafting a recommendation.
- •Policy Q&A assistants
  - •Answer broker or customer questions using exact policy wording instead of hallucinated summaries.
- •Underwriting copilots
  - •Search underwriting guidelines and product rules while drafting risk assessments or referral notes.
By Cyprian Aarons, AI Consultant at Topiax.