LlamaIndex Tutorial (TypeScript): connecting to PostgreSQL for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to wire LlamaIndex TypeScript up to PostgreSQL so you can store and query your documents from a real database instead of keeping everything in memory. You’d use this when your app needs persistence, larger datasets, or a cleaner path from prototype to production.

What You'll Need

  • Node.js 18+
  • PostgreSQL 14+
  • A running PostgreSQL database with a connection string
  • An OpenAI API key
  • These npm packages:
    • llamaindex
    • pg
    • dotenv
  • A TypeScript project with tsconfig.json already set up

Install the packages:

npm install llamaindex pg dotenv
npm install -D typescript ts-node @types/node

Set your environment variables:

OPENAI_API_KEY=your_openai_key
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/llamaindex_demo

Step-by-Step

  1. First, create a small PostgreSQL table to hold your documents. For this tutorial, we’ll keep it simple and store text plus metadata in a single table.
import "dotenv/config";
import pg from "pg";

const { Client } = pg;

async function main() {
  const client = new Client({
    connectionString: process.env.DATABASE_URL,
  });

  await client.connect();

  await client.query(`
    CREATE TABLE IF NOT EXISTS documents (
      id SERIAL PRIMARY KEY,
      content TEXT NOT NULL,
      metadata JSONB DEFAULT '{}'::jsonb,
      created_at TIMESTAMPTZ DEFAULT NOW()
    );
  `);

  await client.end();
}

main().catch(console.error);
  2. Next, load some sample data into PostgreSQL. In production, this is where you’d write from your ingestion pipeline after parsing PDFs, tickets, or policy docs.
import "dotenv/config";
import pg from "pg";

const { Client } = pg;

async function main() {
  const client = new Client({
    connectionString: process.env.DATABASE_URL,
  });

  await client.connect();

  await client.query("DELETE FROM documents;");

  await client.query(
    `
    INSERT INTO documents (content, metadata)
    VALUES
      ($1, $2),
      ($3, $4),
      ($5, $6)
    `,
    [
      "LlamaIndex can index structured and unstructured data.",
      { source: "docs", topic: "llamaindex" },
      "PostgreSQL is useful for persistence and filtering.",
      { source: "internal", topic: "postgres" },
      "TypeScript gives you type safety across the ingestion pipeline.",
      { source: "internal", topic: "typescript" },
    ]
  );

  await client.end();
}

main().catch(console.error);
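Note that pg serializes the plain JavaScript objects in the parameter array to JSON for the JSONB column, so there’s no need to call JSON.stringify yourself.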
  3. Now read the rows back out of PostgreSQL and convert them into LlamaIndex Document objects. This is the bridge between your database and the index.
import "dotenv/config";
import pg from "pg";
import { Document } from "llamaindex";

const { Client } = pg;

async function loadDocuments() {
  const client = new Client({
    connectionString: process.env.DATABASE_URL,
  });

  await client.connect();

  const result = await client.query<{
    id: number;
    content: string;
    metadata: Record<string, unknown>;
  }>("SELECT id, content, metadata FROM documents ORDER BY id ASC");

  const docs = result.rows.map(
    (row) =>
      new Document({
        text: row.content,
        metadata: {
          dbId: row.id,
          ...row.metadata,
        },
      })
  );

  await client.end();
  return docs;
}

loadDocuments().then((docs) => {
  console.log(`Loaded ${docs.length} documents`);
});
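Because pg parses JSONB columns back into plain JavaScript objects by default, row.metadata can be spread directly into the Document metadata without an explicit JSON.parse.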
  4. Build an index from those documents and ask a question against it. This uses the standard LlamaIndex TypeScript API, so you can swap in more advanced retrieval later without changing the ingestion layer.
import "dotenv/config";
import { Document, VectorStoreIndex, Settings, OpenAI } from "llamaindex";
import pg from "pg";

const { Client } = pg;

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
});

async function loadDocuments() {
  const client = new Client({
    connectionString: process.env.DATABASE_URL,
  });

  await client.connect();
  const result = await client.query("SELECT id, content, metadata FROM documents ORDER BY id ASC");
  const docs = result.rows.map(
    (row) =>
      new Document({
        text: row.content,
        metadata: { dbId: row.id, ...row.metadata },
      })
  );
  await client.end();
  return docs;
}

async function main() {
  const docs = await loadDocuments();
  const index = await VectorStoreIndex.fromDocuments(docs);
  const queryEngine = index.asQueryEngine();

  const response = await queryEngine.query({
    query: "Which document talks about persistence?",
  });

  console.log(response.toString());
}

main().catch(console.error);
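One note on embeddings: VectorStoreIndex.fromDocuments embeds every document, and with OPENAI_API_KEY set, LlamaIndex falls back to an OpenAI embedding model by default. If you’d rather pin the model explicitly, here’s a minimal sketch, assuming your llamaindex version exports OpenAIEmbedding from the main package the same way it exports OpenAI (the model name is a suggestion, not a requirement):

import { Settings, OpenAI, OpenAIEmbedding } from "llamaindex";

// Pin both models explicitly instead of relying on defaults.
Settings.llm = new OpenAI({ model: "gpt-4o-mini" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });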
  5. If you want PostgreSQL to do more than just store source data, add filtering before indexing. This is useful when you only want to build an index over a subset of records, like one customer, one tenant, or one document type.
import "dotenv/config";
import pg from "pg";
import { Document } from "llamaindex";

const { Client } = pg;

async function main() {
  const client = new Client({
    connectionString: process.env.DATABASE_URL,
  });

  await client.connect();

  const result = await client.query(
    `
    SELECT id, content, metadata
    FROM documents
    WHERE metadata->>'topic' IN ('postgres', 'llamaindex')
    ORDER BY id ASC
    `
  );

  const docs = result.rows.map(
    (row) =>
      new Document({
        text: row.content,
        metadata: { dbId: row.id, ...row.metadata },
      })
  );

  console.log(`Filtered down to ${docs.length} documents`);

  await client.end();
}

main().catch(console.error);
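From here, the filtered docs array can go straight into VectorStoreIndex.fromDocuments from the previous step, so each tenant or document type gets its own index without changing the rest of the ingestion code.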

Testing It

Run the table creation script first, then insert the sample rows. After that, run the indexing script and make sure it prints a natural-language answer instead of an error.
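Assuming you saved the steps above as separate files (the names here are placeholders), the run order looks like this:

npx ts-node create-table.ts
npx ts-node seed-documents.ts
npx ts-node query-index.ts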

If you get an empty response or no relevant hits, check three things first (a quick check script follows the list):

  • DATABASE_URL points to the right database
  • OPENAI_API_KEY is set correctly
  • Your rows actually exist in the documents table
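To rule out the first and third quickly, run a minimal check script like this one; it reuses the same documents table from step 1:

import "dotenv/config";
import pg from "pg";

const { Client } = pg;

async function main() {
  if (!process.env.DATABASE_URL) {
    throw new Error("DATABASE_URL is not set");
  }
  if (!process.env.OPENAI_API_KEY) {
    console.warn("OPENAI_API_KEY is not set");
  }

  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();

  // Count the rows the indexing script will actually see.
  const result = await client.query("SELECT COUNT(*)::int AS count FROM documents");
  console.log(`documents table has ${result.rows[0].count} rows`);

  await client.end();
}

main().catch(console.error);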

A good sanity check is to query something obvious like “Which document mentions PostgreSQL?” and confirm the answer references the right row content. If that works, your ingestion path from PostgreSQL into LlamaIndex is wired correctly.

Next Steps

  • Replace the simple documents table with a schema that stores chunked text plus embeddings (a sketch follows this list).
  • Add tenant-aware filtering using PostgreSQL metadata columns or JSONB fields.
  • Move from in-memory indexes to a persistent vector store when your dataset grows beyond a few thousand chunks.
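For the first item, a minimal schema sketch might look like the one below. It assumes the pgvector extension is available on your server and uses 1536 dimensions to match OpenAI’s text-embedding-3-small; adjust both to your setup.

import "dotenv/config";
import pg from "pg";

const { Client } = pg;

async function main() {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();

  // Assumes the pgvector extension is installed on the server.
  await client.query("CREATE EXTENSION IF NOT EXISTS vector;");

  // Hypothetical chunk table: one row per chunk, with the embedding
  // dimension matching your embedding model (1536 for text-embedding-3-small).
  await client.query(`
    CREATE TABLE IF NOT EXISTS document_chunks (
      id SERIAL PRIMARY KEY,
      document_id INTEGER REFERENCES documents(id),
      chunk_text TEXT NOT NULL,
      embedding vector(1536),
      metadata JSONB DEFAULT '{}'::jsonb
    );
  `);

  await client.end();
}

main().catch(console.error);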

By Cyprian Aarons, AI Consultant at Topiax.