How to Integrate Anthropic for healthcare with pgvector for multi-agent systems
Combining Anthropic for healthcare with pgvector gives you a practical pattern for clinical-grade multi-agent systems: one agent can reason over patient context, while another retrieves the most relevant prior notes, care plans, or policy snippets from vector search. That matters when you need grounded responses, traceable retrieval, and shared memory across agents without stuffing everything into the prompt.
Prerequisites
- Python 3.10+
- A running PostgreSQL instance
- pgvector installed in PostgreSQL
- An Anthropic API key with access to the relevant healthcare-capable model
- A Postgres user with permissions to create extensions and tables
- Basic familiarity with embeddings and multi-agent orchestration
Install the Python packages:
```shell
pip install anthropic "psycopg[binary]" pgvector numpy python-dotenv
```
Enable the extension in Postgres:
```sql
CREATE EXTENSION IF NOT EXISTS vector;
```
Integration Steps
Step 1: Create a shared vector store for clinical memory
In a multi-agent system, each agent should not keep its own isolated context. Store note chunks, care summaries, and policy text in Postgres so every agent can retrieve the same source of truth.
```python
import os

import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.environ["DATABASE_URL"]

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS clinical_memory (
                id SERIAL PRIMARY KEY,
                patient_id TEXT NOT NULL,
                source TEXT NOT NULL,
                content TEXT NOT NULL,
                embedding VECTOR(1536)
            );
        """)
    conn.commit()
```
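Once the table grows past a few thousand rows, sequential scans over the embedding column get slow. A hypothetical helper like the one below builds the DDL for an HNSW index (available in pgvector 0.5.0 and later); `vector_l2_ops` matches the `<->` L2 operator used in the retrieval step. The helper name and defaults are illustrative, not part of pgvector's API:

```python
def memory_index_ddl(table: str = "clinical_memory", column: str = "embedding") -> str:
    # Approximate-nearest-neighbor index; trades exact results for speed.
    # Run the returned statement with cur.execute(...) after bulk loading.
    return (
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_hnsw_idx "
        f"ON {table} USING hnsw ({column} vector_l2_ops);"
    )
```

Building the index after the bulk insert, rather than before, is generally faster for an initial load.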
Step 2: Generate embeddings in a model-agnostic workflow and store them
Anthropic’s API is used here for the reasoning model layer. For embeddings, use your chosen embedding provider and keep the interface clean so the rest of the system stays model-agnostic. In production, this separation is what lets you swap models without rewriting retrieval.
```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.environ["DATABASE_URL"]

def embed_text(text: str) -> list[float]:
    # Replace with your embedding provider call.
    # Keep the output dimension aligned with your VECTOR column.
    return [0.01] * 1536

clinical_chunks = [
    {
        "patient_id": "p-1001",
        "source": "discharge_summary",
        "content": "Patient discharged on lisinopril 10mg daily. Follow-up in 2 weeks.",
    },
    {
        "patient_id": "p-1001",
        "source": "triage_note",
        "content": "Reports mild shortness of breath on exertion. No chest pain.",
    },
]

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        for chunk in clinical_chunks:
            emb = embed_text(chunk["content"])
            cur.execute(
                """
                INSERT INTO clinical_memory (patient_id, source, content, embedding)
                VALUES (%s, %s, %s, %s)
                """,
                # register_vector adapts numpy arrays to the vector type;
                # plain Python lists are not adapted automatically.
                (chunk["patient_id"], chunk["source"], chunk["content"], np.array(emb)),
            )
    conn.commit()
```
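A mismatch between the embedding length and the `VECTOR(1536)` column only surfaces as a database error at insert time. A small guard like this sketch (the constant and function name are our additions) fails fast in Python with a clearer message:

```python
EXPECTED_DIM = 1536  # must match VECTOR(1536) in the table definition

def check_dimension(embedding: list[float]) -> list[float]:
    # Raise before the INSERT instead of surfacing a database-level error.
    if len(embedding) != EXPECTED_DIM:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, expected {EXPECTED_DIM}"
        )
    return embedding
```

Wrapping each `embed_text` result in `check_dimension(...)` catches provider or model swaps that silently change the output dimension.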
Step 3: Retrieve relevant context from pgvector before calling Anthropic
The retriever agent queries similar notes first. Then the reasoning agent gets only the top matches, which keeps prompts small and makes outputs more defensible.
```python
import os

import numpy as np
import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.environ["DATABASE_URL"]

def embed_text(text: str) -> list[float]:
    return [0.01] * 1536

def retrieve_context(patient_id: str, query: str, k: int = 3):
    # numpy array so register_vector adapts it to the vector type.
    query_embedding = np.array(embed_text(query))
    with psycopg.connect(DB_URL) as conn:
        register_vector(conn)
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT source, content
                FROM clinical_memory
                WHERE patient_id = %s
                ORDER BY embedding <-> %s
                LIMIT %s;
                """,
                (patient_id, query_embedding, k),
            )
            return cur.fetchall()
```
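Top-k retrieval always returns k rows, even when nothing in the store is truly similar. If the query also selects the distance (`embedding <-> %s AS distance`), a post-filter like this hypothetical one can drop weak matches before they reach the reasoning agent; the 0.8 cutoff is an illustrative assumption you would tune on your own data:

```python
def filter_by_distance(rows, max_distance: float = 0.8):
    # rows: (source, content, distance) tuples; smaller distance = closer match.
    return [
        (source, content)
        for source, content, distance in rows
        if distance <= max_distance
    ]
```

An empty result after filtering is itself a useful signal: the reasoning agent can say it found no relevant context instead of speculating from unrelated notes.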
Step 4: Call Anthropic with retrieved context to produce a grounded answer

Use `client.messages.create(...)` to generate a response that cites only what was retrieved. For healthcare workflows, keep the prompt constrained and ask for structured output.
```python
import os

import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def generate_clinical_response(patient_id: str, query: str) -> str:
    context_rows = retrieve_context(patient_id, query)
    context_block = "\n".join(
        f"[{source}] {content}" for source, content in context_rows
    )
    message = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=400,
        temperature=0,
        messages=[
            {
                "role": "user",
                "content": f"""
You are a clinical assistant working from retrieved patient context only.

Patient ID: {patient_id}
Question: {query}

Retrieved context:
{context_block}

Return:
- likely interpretation
- missing data needed before action
- concise next step recommendation
""",
            }
        ],
    )
    return message.content[0].text

print(generate_clinical_response("p-1001", "What follow-up should be scheduled?"))
```
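The prompt above asks for three labeled sections as free text. If you instead instruct the model to answer as JSON with keys `likely_interpretation`, `missing_data`, and `next_step` (the key names are our assumption, not an Anthropic convention), a defensive parser keeps downstream agents working even when the model replies in plain text:

```python
import json

def parse_structured_answer(raw: str) -> dict:
    # Fall back to treating the whole reply as the interpretation if the
    # model did not return valid JSON.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"likely_interpretation": raw, "missing_data": None, "next_step": None}
    return {
        "likely_interpretation": data.get("likely_interpretation"),
        "missing_data": data.get("missing_data"),
        "next_step": data.get("next_step"),
    }
```

Returning a dict with a fixed shape means the routing and audit layers never have to branch on whether the model cooperated.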
Step 5: Wire it into a multi-agent flow
A practical setup is a router agent plus specialist agents. The router decides whether to retrieve from pgvector, then the clinician-facing agent uses Anthropic to synthesize an answer.
```python
def route_request(query: str) -> str:
    if any(term in query.lower() for term in ["follow-up", "medication", "symptom", "note"]):
        return "clinical_retrieval"
    return "general_reasoning"

def handle_request(patient_id: str, query: str) -> str:
    route = route_request(query)
    if route == "clinical_retrieval":
        return generate_clinical_response(patient_id, query)
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=200,
        temperature=0,
        messages=[{"role": "user", "content": query}],
    )
    return response.content[0].text
```
Testing the Integration
Run a simple end-to-end check against one patient record:
```python
result = handle_request("p-1001", "What follow-up should be scheduled after discharge?")
print(result)
```
Example output (model responses vary; expect this shape rather than this exact wording):

```
likely interpretation:
The patient needs a 2-week outpatient follow-up after discharge.

missing data needed before action:
No appointment date or specialty is listed in the retrieved notes.

concise next step recommendation:
Schedule primary care or discharge follow-up within 2 weeks and confirm medication adherence.
```
Real-World Use Cases
Clinical chart summarization

- One agent retrieves prior notes from pgvector.
- Another agent uses Anthropic to summarize changes across encounters.

Care-gap detection

- A retrieval agent pulls recent labs, discharge notes, and medication lists.
- A reasoning agent flags missing follow-ups or unresolved symptoms.

Policy-aware triage assistants

- Store internal triage protocols in pgvector.
- Use Anthropic to answer staff questions while grounding responses in approved policy text.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.