AutoGen Tutorial (TypeScript): handling long documents for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to take a long document, split it into manageable chunks, and use AutoGen in TypeScript to summarize or extract answers without blowing past model context limits. You need this when a single PDF, policy, contract, or knowledge base article is too large to send to an LLM in one shot.

What You'll Need

  • Node.js 18+
  • A TypeScript project with ts-node or tsx
  • AutoGen for TypeScript: npm install @autogenai/autogen
  • An OpenAI API key set as OPENAI_API_KEY
  • A long text file to test with, for example ./docs/policy.txt
  • Basic familiarity with AutoGen agents and message passing

Step-by-Step

  1. Start by loading the document and splitting it into chunks. For beginner-friendly document handling, keep the chunking simple and deterministic so you can debug it later.
import fs from "node:fs";

export function loadAndChunkDocument(path: string, chunkSize = 4000): string[] {
  const text = fs.readFileSync(path, "utf8");
  const chunks: string[] = [];

  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }

  return chunks;
}

const chunks = loadAndChunkDocument("./docs/policy.txt");
console.log(`Loaded ${chunks.length} chunks`);
  2. Create a summarizer agent that processes one chunk at a time. The point here is not to ask the model to “understand everything” at once, but to produce stable intermediate summaries you can combine later.
import { AssistantAgent } from "@autogenai/autogen";

const summarizer = new AssistantAgent({
  name: "summarizer",
  model: "gpt-4o-mini",
  systemMessage:
    "You summarize document chunks for later aggregation. Return concise bullet points only.",
});

async function summarizeChunk(chunk: string): Promise<string> {
  const result = await summarizer.run([
    {
      role: "user",
      content: `Summarize this chunk:\n\n${chunk}`,
    },
  ]);

  return result.messages.at(-1)?.content?.toString() ?? "";
}
  3. Run the summarizer across all chunks and collect the results. For long documents, this map-style pattern is the simplest reliable baseline before you add retrieval or hierarchical summarization.
import { loadAndChunkDocument } from "./chunking.js";
// summarizeChunk is the helper defined in step 2; keep it in the same file, or export it and import it here.

async function summarizeDocument(path: string): Promise<string[]> {
  const chunks = loadAndChunkDocument(path);
  const summaries: string[] = [];

  for (const [index, chunk] of chunks.entries()) {
    console.log(`Summarizing chunk ${index + 1}/${chunks.length}`);
    const summary = await summarizeChunk(chunk);
    summaries.push(summary);
  }

  return summaries;
}

summarizeDocument("./docs/policy.txt").then((summaries) => {
  console.log(summaries.join("\n\n---\n\n"));
});
  4. Add a second pass that merges chunk summaries into one final answer. This is where AutoGen helps you keep the workflow structured: first reduce the document, then answer questions from the reduced representation.
import { AssistantAgent } from "@autogenai/autogen";

const aggregator = new AssistantAgent({
  name: "aggregator",
  model: "gpt-4o-mini",
  systemMessage:
    "You combine multiple partial summaries into one coherent final summary.",
});

async function buildFinalSummary(partialSummaries: string[]): Promise<string> {
  const joined = partialSummaries.map((s, i) => `Chunk ${i + 1}:\n${s}`).join("\n\n");
  const result = await aggregator.run([
    {
      role: "user",
      content: `Combine these chunk summaries into one final summary:\n\n${joined}`,
    },
  ]);

  return result.messages.at(-1)?.content?.toString() ?? "";
}
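
Wired together, steps 1 through 4 form a short map-then-reduce script. The sketch below assumes all of the helpers above (loadAndChunkDocument, summarizeChunk, buildFinalSummary) live in the same file; if you split them across modules, add the matching imports.

// End-to-end sketch: chunk the document, summarize each chunk, then merge the partial summaries.
async function main(): Promise<void> {
  const chunks = loadAndChunkDocument("./docs/policy.txt");

  const partials: string[] = [];
  for (const chunk of chunks) {
    partials.push(await summarizeChunk(chunk));
  }

  const finalSummary = await buildFinalSummary(partials);
  console.log(finalSummary);
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});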
  5. If you need question answering instead of summarization, ask each chunk only when it might contain relevant information. That keeps token usage predictable and avoids sending irrelevant sections through the model.
// This reuses the summarizer and aggregator agents defined in steps 2 and 4.
async function answerFromChunks(question: string, chunks: string[]): Promise<string> {
  const relevantNotes: string[] = [];

  for (const [index, chunk] of chunks.entries()) {
    const probe = await summarizer.run([
      {
        role: "user",
        content: `Does this chunk contain information relevant to the question "${question}"? Reply yes/no and one short reason.\n\n${chunk}`,
      },
    ]);

    const probeText = probe.messages.at(-1)?.content?.toString() ?? "";
    // Check only the start of the reply so a "no" answer that happens to contain
    // the letters "yes" (for example "yesterday") is not counted as relevant.
    if (probeText.trim().toLowerCase().startsWith("yes")) {
      relevantNotes.push(`Chunk ${index + 1}: ${probeText}`);
    }
  }

  const finalAnswer = await aggregator.run([
    {
      role: "user",
      content: `Answer this question using only the relevant notes below.\n\nQuestion: ${question}\n\nNotes:\n${relevantNotes.join("\n\n")}`,
    },
  ]);

  return finalAnswer.messages.at(-1)?.content?.toString() ?? "";
}
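
To try the Q&A path, call answerFromChunks the same way the earlier scripts are run. The question and file path below are placeholders; the chunking helper and both agents from the earlier steps are assumed to be in scope.

const chunks = loadAndChunkDocument("./docs/policy.txt");

answerFromChunks("What is the notice period for cancellation?", chunks).then((answer) => {
  console.log(answer);
});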

Testing It

Run the script against a real long document first, not synthetic lorem ipsum. Check that the number of chunks matches what you expect and that each intermediate summary is short enough to be useful in later steps.
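
A minimal sanity check along those lines can be scripted with Node's built-in assert. The 4000-character limit mirrors the default chunk size above; the 1500-character budget for an intermediate summary is just an assumption to start from, and summarizeDocument is the function from step 3, assumed to be in scope.

import assert from "node:assert";
import { loadAndChunkDocument } from "./chunking.js";

const chunks = loadAndChunkDocument("./docs/policy.txt");

// Basic shape checks: the document produced chunks, and none exceed the configured size.
assert.ok(chunks.length > 0, "document produced no chunks");
assert.ok(chunks.every((c) => c.length <= 4000), "a chunk exceeds the configured size");
console.log(`Got ${chunks.length} chunks`);

summarizeDocument("./docs/policy.txt").then((summaries) => {
  // Flag intermediate summaries that are too long to combine cleanly later.
  summaries.forEach((summary, i) => {
    if (summary.length > 1500) {
      console.warn(`Summary ${i + 1} is ${summary.length} characters; consider smaller chunks`);
    }
  });
});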

Then compare the final summary against the source document manually for accuracy on key facts like dates, obligations, exceptions, or thresholds. If you are doing Q&A, test with one question whose answer appears in only one section and another question whose answer is spread across multiple sections.

If outputs get vague, reduce chunk size or tighten the system message so each chunk summary stays factual and compact. If outputs miss details, increase overlap between chunks later; for beginners, start without overlap so behavior stays easy to reason about.

Next Steps

  • Add overlapping chunks so cross-section references are less likely to be missed (a sketch follows this list)
  • Replace full-document scanning with vector search over embeddings
  • Turn this into a multi-agent workflow where one agent extracts facts and another validates them
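
If you add overlap later, one simple approach is to advance by chunkSize minus overlap so each chunk repeats the tail of the previous one. This is a sketch of that idea, not an AutoGen feature; the 400-character overlap is an arbitrary starting point.

import fs from "node:fs";

// Overlapping variant of loadAndChunkDocument; tune chunkSize and overlap for your documents.
export function loadAndChunkWithOverlap(
  path: string,
  chunkSize = 4000,
  overlap = 400,
): string[] {
  const text = fs.readFileSync(path, "utf8");
  const chunks: string[] = [];
  const step = chunkSize - overlap;

  for (let i = 0; i < text.length; i += step) {
    chunks.push(text.slice(i, i + chunkSize));
    // Stop once a chunk reaches the end of the text to avoid tiny trailing slices.
    if (i + chunkSize >= text.length) break;
  }

  return chunks;
}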

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

