# pgvector vs. LangSmith for Multi-Agent Systems: Which Should You Use?
pgvector and LangSmith solve different problems, and treating them as substitutes is how teams waste weeks. pgvector is a PostgreSQL extension for storing and querying embeddings, with a `vector` column type and `ivfflat`/`hnsw` indexes; LangSmith is an observability and evaluation platform for LLM apps, with tracing, datasets, experiments, and prompt/version tracking. For multi-agent systems, use LangSmith to debug and evaluate the agents, and add pgvector only if you need persistent semantic retrieval inside the workflow.
## Quick Comparison
| Category | pgvector | LangSmith |
|---|---|---|
| Learning curve | Low if you already know SQL and Postgres. You install an extension, create a `vector` column, and query with operators like `<->` (L2 distance) or `<=>` (cosine distance). | Moderate. You need to understand tracing concepts, runs, datasets, evals, and how your agent framework emits spans. |
| Performance | Strong for retrieval inside Postgres. Good enough for most RAG workloads with hnsw or ivfflat, especially when data already lives in the database. | Not a vector database. Performance is about trace ingestion, inspection, dataset runs, and evaluation throughput rather than similarity search latency. |
| Ecosystem | Native fit for PostgreSQL stacks. Works well with Supabase, Rails, Django, FastAPI, and any app already using SQL. | Native fit for LangChain, LangGraph, OpenAI SDK workflows, custom agent orchestrators, and evaluation pipelines. |
| Pricing | Open source extension; your cost is Postgres infra and operational overhead. Self-hosting is straightforward if you run Postgres already. | SaaS pricing for hosted observability/evals; value comes from debugging time saved and team visibility. Self-hosting is not the default path. |
| Best use cases | Embedding storage, semantic search, hybrid SQL + vector filtering, long-term memory in a relational system. | Tracing multi-agent execution paths, comparing prompts/models/tool calls, regression testing agents with datasets. |
| Documentation | Solid Postgres-style docs with clear SQL examples (`CREATE EXTENSION vector;`), indexes, distance operators, tuning guidance. | Good product docs focused on tracing/evals/workflows; strongest when you follow their agent instrumentation patterns closely. |
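To make the distance operators from the table concrete, here is a minimal sketch. In pgvector, `<->` is Euclidean (L2) distance and `<=>` is cosine distance; the `items` table and the literal query vector are made up for illustration:

```sql
-- Assumes: CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
SELECT id,
       embedding <-> '[1,2,3]' AS l2_distance,
       embedding <=> '[1,2,3]' AS cosine_distance
FROM items
ORDER BY embedding <-> '[1,2,3]'
LIMIT 10;
```

The `ORDER BY ... LIMIT` shape matters: pgvector's `ivfflat` and `hnsw` indexes accelerate exactly this nearest-neighbor ordering pattern.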
## When pgvector Wins
Use pgvector when your problem is retrieval first and agent orchestration second.
- **You already run PostgreSQL as the system of record**
  - If customer profiles, tickets, policies, claims, or documents are already in Postgres, keep embeddings there too.
  - You get transactional consistency between structured fields and vectors without adding another datastore.
- **You need hybrid filtering plus semantic search**
  - This is where pgvector earns its keep.
  - Example: find similar claim notes where `tenant_id = ?`, `status = 'open'`, and embedding distance is below a threshold.
  - That kind of query is trivial in SQL and painful in a separate vector service.
- **You want cheap persistent memory for agents**
  - Multi-agent systems often need shared memory: prior decisions, resolved tasks, conversation summaries.
  - Storing those summaries in a `messages` or `memories` table with a `vector` column keeps memory durable and queryable.
- **Your team wants one operational surface**
  - One backup strategy.
  - One access-control model.
  - One database to monitor.
  - For regulated environments like banking and insurance, fewer moving parts usually wins.
Example schema:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE agent_memory (
    id         bigserial PRIMARY KEY,
    tenant_id  uuid NOT NULL,
    agent_name text NOT NULL,
    content    text NOT NULL,
    embedding  vector(1536),
    created_at timestamptz DEFAULT now()
);

-- HNSW index for cosine-distance search over embeddings
CREATE INDEX ON agent_memory USING hnsw (embedding vector_cosine_ops);
```
That gives you searchable memory without introducing a separate retrieval stack.
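The hybrid filtering query described earlier can be sketched against this `agent_memory` schema. The `$1`–`$3` parameters and the `0.25` threshold are illustrative placeholders, not prescribed values:

```sql
-- $1 = query embedding, $2 = tenant id, $3 = max cosine distance (e.g. 0.25)
-- Find the 5 closest memories for one tenant within a distance threshold.
SELECT id, content, embedding <=> $1 AS distance
FROM agent_memory
WHERE tenant_id = $2
  AND embedding <=> $1 < $3
ORDER BY embedding <=> $1
LIMIT 5;
```

The structured filter (`tenant_id`) and the semantic ranking live in one SQL statement, which is exactly the case where a separate vector service adds friction.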
## When LangSmith Wins
Use LangSmith when the hard problem is understanding what your agents are doing.
- **You have multiple agents calling tools in loops**
  - This is exactly where traces matter.
  - LangSmith shows each run: model call, tool call, intermediate output, retries, latency, token usage.
- **You need to compare prompt or model changes**
  - Multi-agent systems regress fast.
  - With LangSmith datasets and experiments you can run the same tasks across versions and see which agent behavior changed.
- **You are shipping to production with debugging requirements**
  - When an underwriting assistant makes the wrong tool call or a claims triage agent drops context across steps, you need trace-level visibility.
  - LangSmith gives you that visibility without building your own telemetry pipeline.
- **You want evaluation discipline**
  - Agent systems fail silently unless you measure them.
  - LangSmith lets you define test cases over datasets and score outputs consistently instead of relying on ad hoc manual review.
Typical instrumentation pattern:

```python
from langsmith import traceable

@traceable(name="triage_agent")
def triage_agent(input_text: str):
    # call model
    # call tools
    # return structured decision
    ...
```
That trace becomes useful the first time an agent chain breaks under real traffic.
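The evaluation-discipline point can also be made concrete. A custom evaluator is just a function that scores an agent's output against a reference; a minimal sketch follows, where the `label` field and the exact-match scoring rule are illustrative choices, not LangSmith requirements (check the LangSmith docs for the evaluator signatures your SDK version accepts before wiring this into `evaluate()`):

```python
def triage_label_match(outputs: dict, reference_outputs: dict) -> dict:
    """Score 1.0 when the agent's triage label matches the reference label,
    0.0 otherwise. Case and surrounding whitespace are ignored so that
    'Escalate' and ' escalate ' count as the same decision."""
    predicted = outputs.get("label", "").strip().lower()
    expected = reference_outputs.get("label", "").strip().lower()
    return {"key": "label_match", "score": 1.0 if predicted == expected else 0.0}
```

Running a function like this over a dataset of past claims gives you a regression signal every time a prompt or model changes, instead of ad hoc spot checks.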
## For Multi-Agent Systems Specifically
My recommendation: pick LangSmith first if you are building multi-agent systems. Multi-agent failures are mostly observability failures: bad routing decisions, tool misuse, bad handoffs between agents. pgvector does nothing to help you inspect that behavior.
Use pgvector alongside it only when the agents need durable semantic memory or retrieval over internal knowledge. In practice: LangSmith for control plane visibility; pgvector for data plane retrieval.
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit