How to Integrate Anthropic with pgvector for Production Healthcare AI
Connecting Anthropic for healthcare with pgvector gives you a practical pattern for retrieval-augmented clinical workflows: keep sensitive patient context in your own vector store, then use Anthropic to reason over only the relevant chunks. That combination is useful for chart summarization, prior-auth assistance, symptom triage, and policy-aware clinical support where you need both strong language understanding and controlled retrieval.
Prerequisites
- Python 3.10+
- An Anthropic API key with access to the model you plan to use
- PostgreSQL 14+ with the `pgvector` extension installed
- A working Postgres user/database with permissions to create tables and extensions
- `pip` packages:
  - `anthropic`
  - `psycopg[binary]`
  - `pgvector`
  - `python-dotenv`
- A document set to index:
  - clinical notes
  - care guidelines
  - payer policy docs
  - internal SOPs
Install the dependencies:
```bash
pip install anthropic "psycopg[binary]" pgvector python-dotenv
```
Integration Steps
1) Set up PostgreSQL with pgvector
Create the extension and a table that stores embeddings alongside your source text. Use cosine distance for semantic search.
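For intuition, pgvector's `<=>` operator returns cosine distance, i.e. 1 minus cosine similarity. Here is the same computation in plain Python:

```python
import math

def cosine_distance(a, b):
    # Mirrors pgvector's `<=>` operator: 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 1.0
```

Ordering by `<=>` ascending therefore returns the most semantically similar chunks first.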
```python
import os

import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.environ["DATABASE_URL"]

with psycopg.connect(DB_URL) as conn:
    with conn.cursor() as cur:
        # The extension must exist before register_vector can look up the vector type.
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.commit()

    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS clinical_chunks (
                id SERIAL PRIMARY KEY,
                patient_id TEXT NOT NULL,
                source TEXT NOT NULL,
                chunk_text TEXT NOT NULL,
                embedding VECTOR(1536)
            );
        """)
        cur.execute("""
            CREATE INDEX IF NOT EXISTS clinical_chunks_embedding_idx
            ON clinical_chunks USING ivfflat (embedding vector_cosine_ops)
            WITH (lists = 100);
        """)
    conn.commit()
```
2) Generate embeddings with Anthropic-compatible text processing
For production, separate embedding generation from generation-time reasoning. If your Anthropic healthcare workflow uses extracted text from notes or documents, chunk it first and send each chunk through your embedding pipeline.
Below is a clean pattern: a separate embedding provider supplies the vectors, and Anthropic handles downstream reasoning. Anthropic does not currently offer a first-party embeddings endpoint (its documentation points to third-party providers such as Voyage AI), so keep the embedding call behind a single function you can swap out.
```python
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def chunk_text(text: str, size: int = 800):
    # Simple fixed-size character chunking; swap in a token-aware splitter if you have one.
    return [text[i:i + size] for i in range(0, len(text), size)]

def get_embedding(text: str):
    # Replace with your approved embedding endpoint/provider.
    # Keep this function isolated so the rest of the app does not care.
    raise NotImplementedError("Wire in your production embedding model here")

note = """
Patient reports worsening shortness of breath on exertion.
History of CHF, HTN, diabetes. Recent weight gain of 4 lbs in 3 days.
"""

chunks = chunk_text(note)
embeddings = [get_embedding(chunk) for chunk in chunks]  # raises until get_embedding is wired in
```
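Until the real endpoint is wired in, a deterministic stub lets you exercise the Postgres plumbing end to end. `fake_embedding` below is a hypothetical test-only helper, not part of any real SDK: it hashes text into a fixed-length vector that carries no semantic signal, so use it for smoke tests only.

```python
import hashlib
import struct

EMBEDDING_DIM = 1536  # must match VECTOR(1536) in the table definition

def fake_embedding(text: str, dim: int = EMBEDDING_DIM):
    # Deterministic pseudo-embedding built from a SHA-256 stream.
    # Good for plumbing tests only; similarity scores are meaningless.
    values = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{text}:{counter}".encode()).digest()
        for i in range(0, len(digest), 4):
            (n,) = struct.unpack("<I", digest[i:i + 4])
            values.append(n / 2**32)  # scale each 32-bit word to [0, 1)
            if len(values) == dim:
                break
        counter += 1
    return values

vec = fake_embedding("smoke test")
print(len(vec))  # 1536
```

Because it is deterministic, inserting and re-querying the same text returns a distance of zero, which makes the retrieval path easy to verify before real embeddings exist.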
3) Store chunks and vectors in pgvector
Insert each chunk into Postgres. Keep metadata tight: patient ID, document source, timestamp, and any access-control fields you need.
```python
import os

import psycopg
from pgvector.psycopg import register_vector

DB_URL = os.environ["DATABASE_URL"]

# `embeddings` comes from the chunking step above.
rows = [
    ("patient_123", "admission_note",
     "Patient reports worsening shortness of breath on exertion.", embeddings[0]),
    ("patient_123", "admission_note",
     "History of CHF, HTN, diabetes. Recent weight gain of 4 lbs in 3 days.", embeddings[1]),
]

with psycopg.connect(DB_URL) as conn:
    register_vector(conn)
    with conn.cursor() as cur:
        cur.executemany(
            """
            INSERT INTO clinical_chunks (patient_id, source, chunk_text, embedding)
            VALUES (%s, %s, %s, %s)
            """,
            rows,
        )
    conn.commit()
```
4) Retrieve relevant context with pgvector and pass it to Anthropic
This is the core integration. Search by similarity first, then give Anthropic only the top matches plus instructions that constrain output style and scope.
```python
import os

import psycopg
from anthropic import Anthropic
from pgvector.psycopg import register_vector

DB_URL = os.environ["DATABASE_URL"]
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def retrieve_context(query_embedding, limit=3):
    # Assumes query_embedding is a numpy array (or pgvector Vector);
    # register_vector teaches psycopg to send it as a vector parameter.
    with psycopg.connect(DB_URL) as conn:
        register_vector(conn)
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT patient_id, source, chunk_text
                FROM clinical_chunks
                ORDER BY embedding <=> %s
                LIMIT %s;
                """,
                (query_embedding, limit),
            )
            return cur.fetchall()

query = "What is driving the patient's dyspnea?"
query_embedding = get_embedding(query)  # from step 2
matches = retrieve_context(query_embedding)

context_block = "\n\n".join(
    f"[{patient_id} | {source}] {chunk_text}"
    for patient_id, source, chunk_text in matches
)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    temperature=0,
    system="You are a healthcare assistant. Use only the provided context. Do not invent facts.",
    messages=[
        {
            "role": "user",
            "content": f"Context:\n{context_block}\n\nQuestion: {query}",
        }
    ],
)
print(response.content[0].text)
```
5) Add guardrails before production traffic
Do not send raw PHI unless your compliance posture allows it. Redact identifiers where possible, log retrieval IDs instead of full note text, and keep tenant or patient filters in every query.
```python
def safe_retrieve(patient_id: str, query_embedding):
    # Always scope retrieval to the patient (or tenant) making the request.
    with psycopg.connect(DB_URL) as conn:
        register_vector(conn)
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT source, chunk_text
                FROM clinical_chunks
                WHERE patient_id = %s
                ORDER BY embedding <=> %s
                LIMIT 5;
                """,
                (patient_id, query_embedding),
            )
            return cur.fetchall()
```
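The redaction side can start as small as a pass over chunk text before it leaves your boundary. The patterns below are illustrative assumptions only, not a compliance control; a production system should use a vetted de-identification service.

```python
import re

# Illustrative patterns only -- real deployments should rely on a vetted
# de-identification service, not ad-hoc regexes.
REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE), "[MRN]"),
]

def redact(text: str) -> str:
    # Apply each pattern in order, replacing matches with a typed placeholder.
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

print(redact("Call 555-867-5309 re: MRN 1048576"))  # Call [PHONE] re: [MRN]
```

Run `redact` on chunk text before embedding or logging it, so identifiers never reach the vector store or the model prompt in the first place.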
Testing the Integration
Run a smoke test: with the rows from step 3 already inserted, query them back semantically and ask Anthropic to summarize what comes back.
```python
test_query = "What changed in this patient's condition?"
test_embedding = get_embedding(test_query)

results = safe_retrieve("patient_123", test_embedding)
context = "\n".join(f"{source}: {chunk}" for source, chunk in results)

resp = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=120,
    temperature=0,
    system="Answer only from retrieved clinical context.",
    messages=[{"role": "user", "content": f"{context}\n\nQuestion: {test_query}"}],
)
print(resp.content[0].text)
```
Expected output (exact wording will vary from run to run):
The patient’s dyspnea worsened recently. Supporting details include a history of CHF and a 4 lb weight gain over 3 days.
Real-World Use Cases
- Clinical chart summarization that pulls only relevant note sections from pgvector before asking Anthropic to draft a concise assessment.
- Prior authorization assistants that retrieve payer policy snippets and compare them against encounter documentation.
- Nurse triage copilots that combine recent symptom history with protocol documents to generate structured next-step suggestions.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit