# LlamaIndex Tutorial (Python): Connecting to PostgreSQL for Intermediate Developers
This tutorial shows you how to connect LlamaIndex to PostgreSQL from Python, store your documents in a Postgres-backed vector store, and query them with a retrieval pipeline. You need this when you want persistence, better operational control, and a database your team already runs in production.
## What You'll Need

- Python 3.10+
- A running PostgreSQL instance with the pgvector extension available
- A PostgreSQL user with permission to create tables
- An OpenAI API key set as the `OPENAI_API_KEY` environment variable
- Python packages: `llama-index`, `llama-index-vector-stores-postgres`, `psycopg2-binary`, `sqlalchemy`
- A local `.env` file or shell environment for secrets
Install the dependencies (including `python-dotenv`, which the code below uses to load the `.env` file):

```shell
pip install llama-index llama-index-vector-stores-postgres psycopg2-binary sqlalchemy python-dotenv
```
## Step-by-Step
- Start by setting up your PostgreSQL connection string and loading your OpenAI key from the environment. For production code, keep credentials out of source control and use a secret manager or environment variables.

```python
import os

from dotenv import load_dotenv

load_dotenv()

PGVECTOR_URL = os.getenv(
    "PGVECTOR_URL",
    "postgresql+psycopg2://postgres:postgres@localhost:5432/llamaindex",
)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is required")
```
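The connection string and the `PGVectorStore.from_params` call used later describe the same database, so it can help to derive the individual parameters from `PGVECTOR_URL` instead of repeating credentials in two places. A minimal stdlib sketch (the helper name `pg_params_from_url` is illustrative, not part of LlamaIndex):

```python
from urllib.parse import urlsplit


def pg_params_from_url(url: str) -> dict:
    """Split a SQLAlchemy-style Postgres URL into PGVectorStore-style kwargs."""
    parts = urlsplit(url)
    return {
        "database": parts.path.lstrip("/"),
        "host": parts.hostname,
        "port": parts.port or 5432,
        "user": parts.username,
        "password": parts.password,
    }


params = pg_params_from_url(
    "postgresql+psycopg2://postgres:postgres@localhost:5432/llamaindex"
)
print(params["database"])  # llamaindex
```

With this in place, the kwargs can be unpacked into `PGVectorStore.from_params(**params, table_name=..., embed_dim=...)`, keeping a single source of truth for connection details.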
- Create a few sample documents and build an index on top of PostgreSQL. This example uses LlamaIndex's vector store integration so embeddings are persisted in Postgres instead of living only in memory. Note that `embed_dim=1536` matches the output size of `text-embedding-3-small`.

```python
from llama_index.core import Document, StorageContext, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

documents = [
    Document(text="PostgreSQL is a relational database that supports ACID transactions."),
    Document(text="LlamaIndex can store embeddings in PostgreSQL using pgvector."),
    Document(text="Vector search helps retrieve semantically similar chunks of text."),
]

vector_store = PGVectorStore.from_params(
    database="llamaindex",
    host="localhost",
    password="postgres",
    port=5432,
    user="postgres",
    table_name="llamaindex_vectors",
    embed_dim=1536,  # must match the embedding model's output dimension
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model,
)
```
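Because a mismatched `embed_dim` only fails once vectors hit the database, a small guard at startup can catch the misconfiguration earlier. A sketch, assuming the dimension table below (these are the published output sizes for OpenAI's embedding models; the helper name `check_embed_dim` is illustrative):

```python
# Known output dimensions for OpenAI embedding models.
EMBED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def check_embed_dim(model: str, embed_dim: int) -> None:
    """Raise early if the vector table's embed_dim doesn't match the model."""
    expected = EMBED_DIMS.get(model)
    if expected is not None and expected != embed_dim:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, but the table uses {embed_dim}"
        )


check_embed_dim("text-embedding-3-small", 1536)  # OK: no exception raised
```

Calling this once before building the index turns a confusing runtime insert error into an immediate, readable failure.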
- Build a query engine and ask a question against the indexed content. The key point here is that retrieval happens through PostgreSQL, while LLM synthesis still happens through the model you configure.

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini", temperature=0)

query_engine = index.as_query_engine(
    llm=llm,
    similarity_top_k=2,
)
response = query_engine.query("What does PostgreSQL support?")
print(response)
```
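To make `similarity_top_k` concrete: retrieval embeds the question, ranks the stored vectors by similarity to it, and keeps the best k chunks for the LLM to synthesize from. A toy, dependency-free version of that ranking (real embeddings have 1536 dimensions, not 3, and pgvector does this inside the database):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


store = [
    ("postgres-doc", [0.9, 0.1, 0.0]),
    ("llamaindex-doc", [0.7, 0.6, 0.1]),
    ("unrelated-doc", [0.0, 0.1, 0.9]),
]
print(top_k([1.0, 0.2, 0.0], store, k=2))  # ['postgres-doc', 'llamaindex-doc']
```

Raising `similarity_top_k` gives the LLM more context to synthesize from at the cost of more tokens and potentially noisier retrieval.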
- If you want this to behave like a real service, separate ingestion from querying. In practice, you ingest once, persist to Postgres, then reconnect later and query without rebuilding the index every time. When reconnecting, pass the same embedding model you used at ingestion; otherwise the query vectors won't line up with the stored embeddings.

```python
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

vector_store = PGVectorStore.from_params(
    database="llamaindex",
    host="localhost",
    password="postgres",
    port=5432,
    user="postgres",
    table_name="llamaindex_vectors",
    embed_dim=1536,
)

# Reattach to the existing table; use the same embed model as at ingestion.
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=OpenAIEmbedding(model="text-embedding-3-small"),
)

query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("How does LlamaIndex use PostgreSQL?"))
```
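One common way to structure this split is a small CLI with separate `ingest` and `query` subcommands, so ingestion runs once and the query path runs on every restart. A skeleton, where the `run_ingest` and `run_query` stubs stand in for the ingestion and reconnect code above (all names here are illustrative):

```python
import argparse


def run_ingest() -> None:
    # Stub: build documents, create the PGVectorStore, and index them.
    print("ingesting documents into Postgres...")


def run_query(question: str) -> None:
    # Stub: reconnect via from_vector_store and run the query.
    print(f"querying: {question}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="LlamaIndex + Postgres app")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("ingest", help="embed documents and store them once")
    query_cmd = sub.add_parser("query", help="ask a question against stored vectors")
    query_cmd.add_argument("question")
    return parser


def main(argv=None) -> None:
    # argv=None means "use sys.argv", so this works as a script: python app.py ingest
    args = build_parser().parse_args(argv)
    if args.command == "ingest":
        run_ingest()
    else:
        run_query(args.question)


main(["query", "What does PostgreSQL support?"])  # prints: querying: What does PostgreSQL support?
```

This keeps the expensive embedding work out of the request path: a deploy or restart only ever re-runs `query`.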
- Clean up by making sure your table exists and your app can reconnect after restart. If the query works after you stop and start the process again, you have persistence working correctly. Note that PGVectorStore prefixes table names with `data_`, so `table_name="llamaindex_vectors"` creates a table named `data_llamaindex_vectors`.

```python
from sqlalchemy import create_engine, text

engine = create_engine(PGVECTOR_URL)
with engine.connect() as conn:
    # PGVectorStore stores rows in data_<table_name>.
    result = conn.execute(text("SELECT COUNT(*) FROM data_llamaindex_vectors"))
    row_count = result.scalar_one()
print(f"Rows in vector table: {row_count}")
```
## Testing It
Run the script once to ingest the documents and confirm it prints a relevant answer. Then restart the Python process and run only the query path to verify the data is still available in PostgreSQL.
If you get an error about missing extensions or unsupported types, check that pgvector is installed in your database instance. If retrieval returns empty results, confirm that the table name, credentials, and embedding dimension match between ingestion and querying.
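If pgvector turns out to be missing, enabling it takes a single statement run in the target database by a superuser (or a role granted the right privilege):

```sql
-- Run once per database, e.g. from psql:
CREATE EXTENSION IF NOT EXISTS vector;
```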
A good test question is something directly covered by one document, like “What is pgvector used for?” You should see an answer grounded in the stored text rather than a generic response.
## Next Steps

- Add metadata filters so you can scope retrieval by tenant, document type, or business unit.
- Swap the sample documents for a loader that ingests PDFs, HTML pages, or database records.
- Add hybrid search patterns if you need both keyword matching and vector similarity in production workloads.
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.