LlamaIndex Tutorial (Python): connecting to PostgreSQL for beginners
This tutorial shows you how to connect LlamaIndex to PostgreSQL from Python, then use that database as a queryable data source for your agent or RAG app. You’d use this when your structured business data already lives in Postgres and you want LlamaIndex to answer questions over it without exporting CSVs or building a separate search layer.
What You'll Need
- Python 3.10+
- A running PostgreSQL instance
- A PostgreSQL database, user, and password
- pip for installing packages
- An OpenAI API key if you want natural-language querying with an LLM
- These Python packages: llama-index, llama-index-llms-openai, sqlalchemy, psycopg2-binary, python-dotenv
Step-by-Step
- Install the dependencies and set your environment variables.
Keep secrets out of code and load them from .env so the same script works locally and in CI.
pip install llama-index llama-index-llms-openai sqlalchemy psycopg2-binary python-dotenv
OPENAI_API_KEY=your_openai_api_key
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=company_db
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
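As a side note, the same variables can be assembled into a SQLAlchemy connection URL in code. This is a minimal stdlib sketch (the defaults mirror the sample .env above and are assumptions, not required values); quote_plus escapes characters such as @ or : that would otherwise break the URL if they appear in the password.

```python
import os
from urllib.parse import quote_plus

# Read the same POSTGRES_* variables the .env file defines,
# falling back to the tutorial's sample values if one is unset.
user = os.getenv("POSTGRES_USER", "postgres")
password = os.getenv("POSTGRES_PASSWORD", "postgres")
host = os.getenv("POSTGRES_HOST", "localhost")
port = os.getenv("POSTGRES_PORT", "5432")
db = os.getenv("POSTGRES_DB", "company_db")

# quote_plus() escapes characters like '@' or ':' that would
# otherwise corrupt the URL if they appear in the password.
db_url = f"postgresql+psycopg2://{user}:{quote_plus(password)}@{host}:{port}/{db}"
print(db_url)
```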
- Create a small PostgreSQL table with sample data.
LlamaIndex can query any SQL table, but it helps to start with something concrete like customers and balances.
from sqlalchemy import create_engine, text
engine = create_engine(
"postgresql+psycopg2://postgres:postgres@localhost:5432/company_db"
)
with engine.begin() as conn:
    conn.execute(text("DROP TABLE IF EXISTS customers"))
    conn.execute(text("""
        CREATE TABLE customers (
            id SERIAL PRIMARY KEY,
            full_name TEXT NOT NULL,
            email TEXT NOT NULL,
            balance NUMERIC(12,2) NOT NULL
        )
    """))
    conn.execute(text("""
        INSERT INTO customers (full_name, email, balance) VALUES
        ('Ada Lovelace', 'ada@example.com', 1200.50),
        ('Grace Hopper', 'grace@example.com', 980.25),
        ('Alan Turing', 'alan@example.com', 150.00)
    """))
- Build the LlamaIndex SQL database wrapper around PostgreSQL.
This is the bridge between LlamaIndex and your Postgres schema.
import os
from dotenv import load_dotenv
from sqlalchemy import create_engine
from llama_index.core import SQLDatabase
load_dotenv()
db_url = (
f"postgresql+psycopg2://{os.environ['POSTGRES_USER']}:"
f"{os.environ['POSTGRES_PASSWORD']}@"
f"{os.environ['POSTGRES_HOST']}:"
f"{os.environ['POSTGRES_PORT']}/"
f"{os.environ['POSTGRES_DB']}"
)
engine = create_engine(db_url)
sql_database = SQLDatabase(engine, include_tables=["customers"])
- Create a query engine that can translate English questions into SQL.
For beginners, this is the most useful entry point because you can ask questions without writing SQL manually every time.
import os
from dotenv import load_dotenv
from llama_index.core import Settings
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.llms.openai import OpenAI
load_dotenv()
Settings.llm = OpenAI(model="gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
query_engine = NLSQLTableQueryEngine(
sql_database=sql_database,
tables=["customers"],
)
- Ask a question and inspect the response.
The engine will generate SQL under the hood, run it against PostgreSQL, and return a natural-language answer.
response = query_engine.query("Who has the highest balance?")
print(str(response))
- If you want to verify the generated SQL during debugging, inspect the response metadata and source nodes.
This is useful when tuning prompts or checking whether the model queried the right table and columns. The SQL the model generated is available under response.metadata["sql_query"].
response = query_engine.query("List all customers with balances over 500.")
print("Answer:")
print(response)
print("\nGenerated SQL:")
print(response.metadata.get("sql_query"))
print("\nSource text:")
for node in response.source_nodes:
    print(node.node.get_content())
Testing It
Run the script end to end and confirm three things: PostgreSQL starts cleanly, the table gets created with rows inserted, and LlamaIndex returns an answer instead of an exception. If you get an authentication error, check your POSTGRES_* values first; if you get an API error, confirm OPENAI_API_KEY is loaded correctly.
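A quick preflight check makes the first failure mode easier to diagnose. This stdlib sketch (not part of the tutorial script) reports every missing variable at once instead of letting the first connection or API call fail:

```python
import os

# Every variable the tutorial's script reads, in one place.
REQUIRED = [
    "POSTGRES_HOST", "POSTGRES_PORT", "POSTGRES_DB",
    "POSTGRES_USER", "POSTGRES_PASSWORD", "OPENAI_API_KEY",
]

# Collect all missing names so you fix them in one pass.
missing = [name for name in REQUIRED if not os.getenv(name)]
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Run it after load_dotenv() so values from .env are counted as present.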
A good test query is something simple like “Who has the highest balance?” because it forces the model to sort by balance and return a single top row. You should see a result that points to Ada Lovelace with 1200.50.
If you want extra confidence, connect directly with psql or any SQL client and run:
SELECT full_name, balance FROM customers ORDER BY balance DESC LIMIT 1;
The result should match what LlamaIndex returns.
Next Steps
- Add more tables and use include_tables to control which ones LlamaIndex can access.
- Learn how to add row-level filters so agents only see approved customer data.
- Move from raw SQL querying to tool-based agents when you need multi-step workflows across Postgres and other systems.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.