How to Integrate Haystack for insurance with Elasticsearch for production AI
Combining Haystack for insurance with Elasticsearch gives you a practical retrieval layer for production AI agents. In insurance workflows, that usually means fast access to policy docs, claims notes, underwriting rules, and customer correspondence without stuffing everything into the model context.
The pattern is simple: Haystack handles orchestration and retrieval logic, while Elasticsearch gives you indexed, low-latency search over structured and unstructured insurance data. That combination is what you want when an agent needs to answer questions with traceable evidence instead of guessing.
Prerequisites
- Python 3.10+
- An Elasticsearch cluster running locally or in your environment
- A Haystack installation compatible with your project
- Network access from your app to Elasticsearch
- Insurance documents ready to index:
  - policy PDFs
  - claims summaries
  - underwriting guidelines
  - FAQ or knowledge base articles
- Environment variables set for credentials if your cluster is secured
Install the dependencies:
pip install haystack-ai elasticsearch elasticsearch-haystack
The elasticsearch-haystack package is Haystack's Elasticsearch integration. It provides the retriever imported in step 4, so it has to be installed alongside the core library and the official client.
Integration Steps
1) Start by connecting to Elasticsearch
Use the official Python client first. This verifies connectivity before you wire it into Haystack.
from elasticsearch import Elasticsearch

es = Elasticsearch(
    "http://localhost:9200",
    basic_auth=("elastic", "changeme"),
)
print(es.info())
For production, point this at your managed cluster and use API keys instead of hardcoded passwords.
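As a minimal sketch of that advice, the helper below reads the cluster URL and API key from environment variables instead of hardcoding them. The variable names ES_URL and ES_API_KEY and the function name are assumptions made for illustration; use whatever your deployment already defines.

```python
import os


def es_connection_kwargs(env=None):
    """Build keyword arguments for elasticsearch.Elasticsearch from the environment."""
    env = os.environ if env is None else env
    # ES_URL and ES_API_KEY are hypothetical names, not a convention of the client.
    kwargs = {"hosts": env.get("ES_URL", "http://localhost:9200")}
    api_key = env.get("ES_API_KEY")
    if api_key:
        # Only pass api_key when one is configured, so local unsecured clusters still work.
        kwargs["api_key"] = api_key
    return kwargs


# Usage (needs a reachable cluster):
# from elasticsearch import Elasticsearch
# es = Elasticsearch(**es_connection_kwargs())
```

Keeping the lookup in one function makes it easy to swap in a secrets manager later without touching the rest of the integration.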
2) Create an index for insurance documents
Use a dedicated index so your retrieval layer stays isolated from operational data.
index_name = "insurance-docs"

# Create the index only if it does not exist yet, so the script can be re-run safely.
if not es.indices.exists(index=index_name):
    es.indices.create(
        index=index_name,
        mappings={
            "properties": {
                "content": {"type": "text"},
                "title": {"type": "text"},
                "doc_type": {"type": "keyword"},
                "policy_id": {"type": "keyword"},
                "embedding": {"type": "dense_vector", "dims": 384},
            }
        },
    )

print(f"Index ready: {index_name}")
If you plan to use vector retrieval, make sure the embedding dimension matches the model you choose.
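One way to catch a mismatch early is to compare each vector's length against the mapping before indexing. The check below is a sketch under the assumption that the mapping has the same shape as the one in this step; the function name is hypothetical.

```python
def check_embedding_dims(mapping, vector, field="embedding"):
    """Raise ValueError if the vector length differs from the mapped dense_vector dims."""
    expected = mapping["properties"][field]["dims"]
    if len(vector) != expected:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, but '{field}' is mapped with {expected}"
        )
    return True


# Same mapping shape as the index definition in this step.
mapping = {"properties": {"embedding": {"type": "dense_vector", "dims": 384}}}
check_embedding_dims(mapping, [0.0] * 384)  # passes silently
```

Running this once at startup with a single embedded sample is usually enough to detect a model swap that changed the vector size.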
3) Index insurance content through Haystack documents
Haystack works best when your content is represented as Document objects. In production, this is where you normalize claims notes, policy text, and underwriting guidance before indexing.
from haystack import Document

documents = [
    Document(
        content="Coverage applies when water damage results from sudden pipe burst.",
        meta={"title": "Home Policy Water Damage", "doc_type": "policy", "policy_id": "HP-1001"},
    ),
    Document(
        content="Claims above $10,000 require supervisor approval before settlement.",
        meta={"title": "Claims Approval Rule", "doc_type": "guideline", "policy_id": "CLM-OPS"},
    ),
]

# Store each document with its metadata flattened into top-level fields.
for doc in documents:
    es.index(
        index=index_name,
        document={
            "content": doc.content,
            **doc.meta,
        },
    )

# Make the new documents searchable immediately.
es.indices.refresh(index=index_name)
print("Documents indexed")
This keeps the source of truth in Elasticsearch while letting Haystack manage downstream retrieval and agent reasoning.
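Because the indexing loop lets Elasticsearch assign IDs, re-running it would create duplicates. The sketch below shows one possible normalization step that derives a stable ID from the policy ID and content and emits actions in the shape expected by the client's bulk helper. The record layout (plain dicts with content and meta keys) and the function name are assumptions for illustration.

```python
import hashlib


def to_bulk_actions(records, index_name):
    """Yield bulk actions with content-derived IDs so re-indexing overwrites instead of duplicating."""
    for record in records:
        # Collapse stray whitespace left over from PDF or email extraction.
        content = " ".join(record["content"].split())
        key = f"{record['meta'].get('policy_id', '')}|{content}"
        doc_id = hashlib.sha256(key.encode("utf-8")).hexdigest()
        yield {
            "_index": index_name,
            "_id": doc_id,
            "_source": {"content": content, **record["meta"]},
        }


# Usage (needs a reachable cluster and the `es` client from step 1):
# from elasticsearch.helpers import bulk
# bulk(es, to_bulk_actions(records, "insurance-docs"))
```

A content-derived ID means an edited document gets a new ID, so stale versions still need to be deleted separately; whether that trade-off fits depends on how your source documents change.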
4) Build a Haystack retriever over Elasticsearch
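In Haystack pipelines, use an Elasticsearch-backed retriever component so queries route into your index. The exact class name depends on the Haystack version you deploy, but the pattern is consistent. In the elasticsearch-haystack integration for Haystack 2.x, the retriever wraps an ElasticsearchDocumentStore, and the document store holds the connection details and index name.
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchBM25Retriever
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

# The retriever takes a Haystack document store, not a raw Elasticsearch client.
# Extra keyword arguments such as basic_auth are forwarded to the underlying client.
document_store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    index=index_name,
    basic_auth=("elastic", "changeme"),
)
retriever = ElasticsearchBM25Retriever(document_store=document_store, top_k=3)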
template = """
Answer the question using only the retrieved documents.
Question: {{question}}
Documents:
{% for doc in documents %}
- {{ doc.content }} ({{ doc.meta.title }})
{% endfor %}
Answer:
"""
prompt_builder = PromptBuilder(template=template)
pipe = Pipeline()
pipe.add_component("retriever", retriever)
pipe.add_component("prompt_builder", prompt_builder)
pipe.connect("retriever.documents", "prompt_builder.documents")
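If you are using embeddings instead of BM25, swap in a vector retriever (ElasticsearchEmbeddingRetriever in the same integration package), add a query embedder in front of it, and keep the prompt-building half of the pipeline unchanged.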
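Since doc_type and policy_id are mapped as keyword fields, retrieval can also be narrowed by metadata. The helper below is a sketch that builds a filter dictionary in the Haystack 2.x filter syntax; the function name is hypothetical, and it assumes the Elasticsearch integration resolves the "meta." prefix against the flattened fields indexed in step 3, which is worth verifying against the version you deploy.

```python
def build_filters(doc_type=None, policy_id=None):
    """Return a Haystack-style filter dict, or None when no restriction is requested."""
    conditions = []
    if doc_type is not None:
        conditions.append({"field": "meta.doc_type", "operator": "==", "value": doc_type})
    if policy_id is not None:
        conditions.append({"field": "meta.policy_id", "operator": "==", "value": policy_id})
    if not conditions:
        return None
    # All supplied conditions must match.
    return {"operator": "AND", "conditions": conditions}


# Usage (assumes the pipeline from this step):
# pipe.run({
#     "retriever": {"query": "water damage", "filters": build_filters(doc_type="policy")},
#     "prompt_builder": {"question": "water damage"},
# })
```

Filtering on doc_type is a simple way to keep internal handling guidelines out of customer-facing answers.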
5) Run a query end-to-end
Now test the full path from user question to retrieved evidence.
question = "When does water damage coverage apply?"

result = pipe.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)
print(result["prompt_builder"]["prompt"])
At this point your agent can feed the generated prompt into an LLM response step, or return citations directly to the user.
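For the citation path, a small formatter is often enough. The sketch below assumes each retrieved document has been converted to a plain dict with content and meta keys (for Haystack documents, doc.to_dict(flatten=False) should produce that shape); the function name and citation layout are illustrative choices, not part of Haystack.

```python
def format_citations(documents):
    """Render retrieved documents as numbered citations with title and policy reference."""
    lines = []
    for number, doc in enumerate(documents, start=1):
        meta = doc.get("meta", {})
        title = meta.get("title", "Untitled")
        reference = meta.get("policy_id", "n/a")
        lines.append(f"[{number}] {title} ({reference}): {doc['content']}")
    return "\n".join(lines)


retrieved = [
    {
        "content": "Coverage applies when water damage results from sudden pipe burst.",
        "meta": {"title": "Home Policy Water Damage", "policy_id": "HP-1001"},
    }
]
print(format_citations(retrieved))
```

To get the retriever's documents out of the pipeline alongside the prompt, Haystack 2.x pipelines accept an include_outputs_from argument on run; check that it is available in your version before relying on it.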
Testing the Integration
Use a simple smoke test that checks both indexing and retrieval.
question = "What approval is needed for claims above $10,000?"
query = {
    "retriever": {"query": question},
    "prompt_builder": {"question": question},
}

result = pipe.run(query)
output = result["prompt_builder"]["prompt"]
print(output)

# The retrieved guideline text must appear in the rendered prompt.
assert "supervisor approval" in output.lower()
Expected output:
Answer the question using only the retrieved documents.
Question: What approval is needed for claims above $10,000?
Documents:
- Claims above $10,000 require supervisor approval before settlement. (Claims Approval Rule)
Answer:
If that assertion passes, your Haystack-to-Elasticsearch path is working and returning relevant insurance content.
Real-World Use Cases
- Claims triage assistant
  - Retrieve policy language, prior claims notes, and handling rules before drafting a response.
- Underwriting copilot
  - Search historical submissions and guideline docs to flag missing information or rule conflicts.
- Customer service agent
  - Answer coverage questions with citations from approved policy documents instead of free-form model output.
The production pattern here is stable: store canonical insurance content in Elasticsearch, retrieve it through Haystack, then let your agent reason over grounded context. That gives you speed, auditability, and enough control to ship something a compliance team will actually sign off on.
By Cyprian Aarons, AI Consultant at Topiax.