pgvector vs Elasticsearch for Insurance: Which Should You Use?

By Cyprian Aarons · Updated 2026-04-21
Tags: pgvector, elasticsearch, insurance

pgvector and Elasticsearch solve different problems, even though both can store vectors and run similarity search. pgvector is a Postgres extension for teams that want vector search inside their transactional database; Elasticsearch is a search engine built for retrieval, filtering, scoring, and operational search at scale.

For insurance, use pgvector first if your workload is embedded in claims, policy, or customer data already living in Postgres. Use Elasticsearch only when search is a first-class product requirement with heavy text retrieval, faceting, and high-volume indexing.

Quick Comparison

| Area | pgvector | Elasticsearch |
| --- | --- | --- |
| Learning curve | Low if you already know PostgreSQL: `CREATE EXTENSION vector`, the `vector`, `halfvec`, and `sparsevec` types, plus `ivfflat` and `hnsw` indexes | Higher; you need to understand indices, mappings, analyzers, shards, replicas, and relevance tuning |
| Performance | Strong for moderate-scale vector search inside Postgres; excellent when combined with SQL filters and joins | Strong for large-scale retrieval and hybrid search; built for distributed indexing and query throughput |
| Ecosystem | Native Postgres ecosystem: transactions, joins, RLS, backups, ORM support | Mature search ecosystem: full-text search, aggregations, ingest pipelines, Kibana |
| Pricing | Usually cheaper if you already run Postgres; one system to operate | More expensive operationally; separate cluster or managed service plus indexing overhead |
| Best use cases | RAG over policy/claims data, deduping documents, semantic lookup with SQL filters | Enterprise search portals, log-like document search, hybrid text + vector retrieval at scale |
| Documentation | Straightforward extension docs and Postgres examples | Broad but more complex; lots of knobs and tuning guidance |

When pgvector Wins

  • Your source of truth is already Postgres. If claims, policies, customer records, and document metadata are in PostgreSQL, keep the vector index there. You get one transaction boundary, one backup strategy, one access model.

  • You need strict SQL filtering with semantic search. Insurance queries are rarely “just vector similarity.” They are usually “find similar claim notes for this line of business in the last 18 months for this state.” pgvector lets you combine ORDER BY embedding <=> $1 with normal SQL filters cleanly.

  • You want simpler ops and fewer moving parts. pgvector adds an extension to an existing database. That means no separate cluster to size, no shard planning, no analyzer tuning just to get started.

  • You care about transactional consistency. If a claim gets updated and its embedding needs to stay in sync with the row that owns it, Postgres is the right place. You can update the row and embedding together instead of managing eventual consistency across systems.

Example pattern

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE claim_notes (
  id bigserial PRIMARY KEY,
  claim_id bigint NOT NULL,
  state text NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  note text NOT NULL,
  embedding vector(1536)
);

CREATE INDEX ON claim_notes USING hnsw (embedding vector_cosine_ops);

SELECT id, claim_id
FROM claim_notes
WHERE state = 'CA'
  AND created_at >= now() - interval '18 months'
ORDER BY embedding <=> '[...]'::vector
LIMIT 10;

That is the right shape for insurance. The semantic ranking stays close to the business filters.
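The transactional-consistency point above follows the same shape. As a sketch, assuming your embedding pipeline has already produced the new vector, the note text and its embedding can change in a single transaction against the example schema, so readers never see a note paired with a stale vector (the row id and note text here are hypothetical, and '[...]' stands in for the real vector literal):

```sql
BEGIN;

-- Update the note and its embedding atomically in one statement.
UPDATE claim_notes
SET note      = 'Adjuster re-inspected roof; hail damage confirmed.',
    embedding = '[...]'::vector  -- new embedding from your pipeline
WHERE id = 42;

COMMIT;
```

If the transaction rolls back, both the text and the vector revert together, which is exactly the guarantee a separate search cluster cannot give you.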

When Elasticsearch Wins

  • You need real enterprise search. If users expect keyword search across policy PDFs, endorsements, emails, notes, and attachments with relevance tuning, Elasticsearch is better. Its inverted index model is still stronger than Postgres for classic text retrieval.

  • You need faceting and aggregations. Insurance teams ask questions like “show me claims by carrier, loss type, region, adjuster team, and status.” Elasticsearch’s aggregations are built for this. pgvector is not a replacement for a proper analytics/search engine.

  • You have huge document volumes. Once you are indexing millions of documents with frequent updates and multiple query patterns — keyword search plus semantic ranking plus filters — Elasticsearch’s distributed architecture starts earning its keep.

  • You want hybrid retrieval as a primary feature. Elasticsearch supports dense vectors via dense_vector, approximate kNN search backed by HNSW-style indexing (depending on version and configuration), plus BM25-style lexical scoring. That makes it strong when exact terms matter alongside semantic similarity.

Example pattern

PUT claims
{
  "mappings": {
    "properties": {
      "note": { "type": "text" },
      "state": { "type": "keyword" },
      "created_at": { "type": "date" },
      "embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}

Then you can combine structured filters with text relevance and vector scoring in one retrieval layer. That matters when the user experience depends on ranked search results rather than database-style lookup.
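As a sketch of that combined retrieval, in recent Elasticsearch 8.x versions a single _search request can pair a filtered kNN clause with a BM25 match query against the mapping above (the query text and state value are hypothetical, and [...] stands in for the real query vector):

```json
POST claims/_search
{
  "knn": {
    "field": "embedding",
    "query_vector": [...],
    "k": 10,
    "num_candidates": 100,
    "filter": { "term": { "state": "CA" } }
  },
  "query": {
    "match": { "note": "hail roof damage" }
  },
  "size": 10
}
```

Elasticsearch scores the lexical and vector clauses together, which is what gives ranked hybrid results rather than two separate result lists.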

For Insurance Specifically

My recommendation: start with pgvector unless your product is explicitly a search product. Most insurance AI workloads are RAG over internal records — claims notes, underwriting guidelines, policy language — where SQL filters matter more than fancy search infrastructure.

Use Elasticsearch only if your team needs heavyweight document search across large corpora with faceting, analytics-style aggregation queries, and multi-field relevance tuning. For core insurance systems of record plus AI retrieval on top, pgvector is the cleaner default and the cheaper mistake-proof choice.

