LlamaIndex Tutorial (TypeScript): handling long documents for intermediate developers
This tutorial shows you how to ingest long documents in TypeScript with LlamaIndex, split them into retrieval-friendly chunks, and query them without blowing past model context limits. You need this when your source files are too large for a single prompt, but you still want accurate answers grounded in the full document.
What You'll Need
- Node.js 18+ and npm
- A TypeScript project initialized with a `tsconfig.json`
- Packages:
  - `llamaindex`
  - `dotenv`
  - `typescript`
  - `tsx` or `ts-node` for running scripts
- An OpenAI API key set as `OPENAI_API_KEY`
- A long text document, PDF text export, or `.txt` file to test against
Step-by-Step
- Start by installing the packages and setting up your environment. For long-document workflows, you want both indexing and retrieval available from the same SDK.

```bash
npm install llamaindex dotenv
npm install -D typescript tsx @types/node
```
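`dotenv` reads environment variables from a local `.env` file via the `import "dotenv/config"` line each script below starts with, so your key never has to live in your shell profile. A minimal `.env` (fill in your own key):

```bash
# .env — read automatically by "dotenv/config"
OPENAI_API_KEY=sk-...
```

With `tsx` installed, each script in this tutorial runs as `npx tsx script.ts`. Note that the later examples use top-level `await`, which requires an ESM-capable runner such as `tsx` or `ts-node --esm`.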
- Create a small loader script that reads a long local file and turns it into LlamaIndex documents. In production, this is where you'd swap in PDF parsing, SharePoint export handling, or database text extraction.
import "dotenv/config";
import fs from "node:fs";
import path from "node:path";
import { Document } from "llamaindex";
const filePath = path.join(process.cwd(), "data", "long-document.txt");
const rawText = fs.readFileSync(filePath, "utf-8");
const doc = new Document({
text: rawText,
metadata: {
source: "long-document.txt",
type: "internal-policy",
},
});
console.log(`Loaded document with ${doc.text.length} characters`);
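If your source material is a folder of files rather than one big export, the same wrapper extends naturally. A minimal sketch, assuming a `data/` directory of `.txt` files (swap the filter and reader for your real formats):

```ts
import fs from "node:fs";
import path from "node:path";
import { Document } from "llamaindex";

// Read every .txt file in data/ and wrap each one in a Document,
// carrying the file name as metadata for later traceability.
const dataDir = path.join(process.cwd(), "data");
const documents = fs
  .readdirSync(dataDir)
  .filter((name) => name.endsWith(".txt"))
  .map((name) => {
    const text = fs.readFileSync(path.join(dataDir, name), "utf-8");
    return new Document({ text, metadata: { source: name } });
  });

console.log(`Loaded ${documents.length} documents`);
```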
- Build an index with chunking tuned for long documents. The important part is not just indexing the whole file, but splitting it into chunks that preserve enough context for retrieval while staying small enough for embeddings and generation.
import "dotenv/config";
import fs from "node:fs";
import path from "node:path";
import {
Document,
Settings,
VectorStoreIndex,
} from "llamaindex";
Settings.chunkSize = 1024;
Settings.chunkOverlap = 150;
const filePath = path.join(process.cwd(), "data", "long-document.txt");
const rawText = fs.readFileSync(filePath, "utf-8");
const document = new Document({
text: rawText,
metadata: { source: "long-document.txt" },
});
const index = await VectorStoreIndex.fromDocuments([document]);
console.log("Index built");
- Query the index with a retriever-backed engine instead of sending the whole document to the model. This keeps prompts small and gives you answers grounded in the most relevant chunks.
import "dotenv/config";
import fs from "node:fs";
import path from "node:path";
import {
Document,
Settings,
VectorStoreIndex,
} from "llamaindex";
Settings.chunkSize = 1024;
Settings.chunkOverlap = 150;
const filePath = path.join(process.cwd(), "data", "long-document.txt");
const rawText = fs.readFileSync(filePath, "utf-8");
const document = new Document({ text: rawText });
const index = await VectorStoreIndex.fromDocuments([document]);
const queryEngine = index.asQueryEngine({
similarityTopK: 3,
});
const response = await queryEngine.query({
query: "What are the key responsibilities described in this document?",
});
console.log(String(response));
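The engine response also carries the chunks it answered from. Depending on your installed llamaindex version they are exposed as `sourceNodes`; a sketch, appended to the script above, assuming that property:

```ts
// Add MetadataMode to the llamaindex import at the top of the file:
// import { Document, MetadataMode, Settings, VectorStoreIndex } from "llamaindex";

// List the chunks the answer was grounded in (assumes the response
// exposes `sourceNodes`, as recent llamaindex versions do).
for (const source of response.sourceNodes ?? []) {
  console.log("SCORE:", source.score);
  console.log("TEXT:", source.node.getContent(MetadataMode.NONE).slice(0, 200));
}
```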
- For better control on very large documents, inspect retrieved nodes before answering. This is useful when legal, compliance, or claims teams need traceability back to exact source passages.
import "dotenv/config";
import fs from "node:fs";
import path from "node:path";
import {
Document,
Settings,
VectorStoreIndex,
} from "llamaindex";
Settings.chunkSize = 1024;
Settings.chunkOverlap = 150;
const filePath = path.join(process.cwd(), "data", "long-document.txt");
const rawText = fs.readFileSync(filePath, "utf-8");
const document = new Document({ text: rawText });
const index = await VectorStoreIndex.fromDocuments([document]);
const retriever = index.asRetriever({ similarityTopK: 3 });
const nodes = await retriever.retrieve("What deadlines are mentioned?");
for (const node of nodes) {
console.log("SCORE:", node.score);
console.log("TEXT:", node.node.getContent().slice(0, 400));
console.log("---");
}
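From here it is a short step to a simple traceability record, for example keeping only chunks above a score threshold. Continuing the script above (the 0.7 cutoff is an arbitrary illustration, tune it to your embedding model):

```ts
// Build a minimal audit trail: question, score, and the exact
// passage retrieved. The 0.7 threshold is an illustrative value.
const MIN_SCORE = 0.7;

const trail = nodes
  .filter((n) => (n.score ?? 0) >= MIN_SCORE)
  .map((n) => ({
    question: "What deadlines are mentioned?",
    score: n.score,
    passage: n.node.getContent(MetadataMode.NONE),
  }));

console.log(JSON.stringify(trail, null, 2));
```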
Testing It
Run the script against a document that is clearly longer than a single model prompt window, such as a policy manual or contract export. Ask questions that should only be answerable by retrieving specific sections, not by summarizing the whole file.
Check that the output references details from different parts of the document rather than hallucinating broad summaries. If you see irrelevant answers, reduce `chunkSize`, increase `chunkOverlap`, or raise `similarityTopK` to retrieve more context.
A good sanity check is to ask for a section-specific fact like dates, responsibilities, exclusions, or thresholds. If those answers are correct and repeatable across runs, your long-document pipeline is working.
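To make the repeatability check concrete, re-ask the same section-specific question a few times against the `queryEngine` from the earlier step and compare the outputs:

```ts
// A grounded pipeline should return consistent, chunk-backed
// answers for the same question across runs.
const question = "What deadlines are mentioned?";

for (let run = 1; run <= 3; run++) {
  const answer = await queryEngine.query({ query: question });
  console.log(`Run ${run}:`, String(answer).slice(0, 200));
}
```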
Next Steps
- Add metadata filters so you can query only specific departments, policy versions, or claim types (a version-agnostic sketch follows this list)
- Swap in a PDF or DOCX loader and keep the same indexing pipeline
- Explore response synthesizers and rerankers for higher precision on dense enterprise documents
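Native metadata filter APIs vary across llamaindex versions, so a version-agnostic starting point is to post-filter retrieved nodes by the metadata you attached at load time. Continuing the retriever script above (the `type` value mirrors the loader example):

```ts
// Keep only chunks whose source Document carried the
// "internal-policy" metadata attached at load time.
const policyNodes = nodes.filter(
  (n) => n.node.metadata?.type === "internal-policy",
);

console.log(`Kept ${policyNodes.length} of ${nodes.length} chunks`);
```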
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.