AutoGen Tutorial (Python): chunking large documents for intermediate developers
This tutorial shows you how to split a large document into manageable chunks, summarize each chunk with AutoGen, and then combine those summaries into a final result. You need this when the model context window is too small for the full document, or when you want more stable extraction from long policy docs, contracts, or reports.
What You'll Need
- Python 3.10+
- `pyautogen` installed
- An OpenAI-compatible API key
- A text file or long string to process
- Basic familiarity with AutoGen agents and `AssistantAgent`
Step-by-Step
- Start by installing AutoGen and setting your API key. I’m using the OpenAI-compatible client config because it works cleanly with current AutoGen setups.

```shell
pip install pyautogen
export OPENAI_API_KEY="your-key-here"
```
- Load a long document and split it into chunks. For production, chunk by paragraph boundaries first, then cap by approximate character count so you don’t break sentences midway unless you have to.
```python
from pathlib import Path

def chunk_text(text: str, max_chars: int = 4000):
    """Greedily pack whole paragraphs into chunks of at most max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = []
    for para in paragraphs:
        candidate = "\n\n".join(current + [para])
        if len(candidate) <= max_chars:
            current.append(para)
        else:
            if current:
                chunks.append("\n\n".join(current))
            # A paragraph longer than max_chars becomes its own oversized chunk.
            current = [para]
    if current:
        chunks.append("\n\n".join(current))
    return chunks

document = Path("large_document.txt").read_text(encoding="utf-8")
chunks = chunk_text(document, max_chars=4000)
print(f"Loaded {len(chunks)} chunks")
```
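The 4000-character cap is really a stand-in for the model’s token budget. If you want a quick sanity check without pulling in a tokenizer, a rough rule of thumb (an assumption, not an exact count: English prose averages around four characters per token) looks like this:

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # This is an estimate only; use a real tokenizer for hard limits.
    return max(1, len(text) // 4)

chunk = "word " * 800          # 4000 characters
print(approx_tokens(chunk))    # → 1000
```

So a 4000-character chunk lands near 1000 tokens, which leaves comfortable headroom even in small context windows. Swap in a real tokenizer if you need exact counts.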
- Create an AutoGen assistant that summarizes each chunk consistently. Keep the prompt strict so every chunk returns structured output you can merge later.
```python
import os

from autogen import AssistantAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,  # deterministic output makes merging more stable
}

summarizer = AssistantAgent(
    name="summarizer",
    llm_config=llm_config,
    system_message=(
        "You summarize document chunks for downstream merging. "
        "Return concise bullet points covering facts, obligations, dates, risks, and named entities."
    ),
)
```
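Because the merge step depends on every chunk coming back as bullets, it is worth validating the format before accepting a summary. This is a hypothetical helper, not part of AutoGen; adjust the accepted markers to whatever your prompt asks for:

```python
def looks_like_bullets(summary: str) -> bool:
    """Cheap format check: every non-empty line should start with a bullet marker."""
    lines = [line.strip() for line in summary.splitlines() if line.strip()]
    return bool(lines) and all(line.startswith(("-", "*", "•")) for line in lines)

print(looks_like_bullets("- fact one\n- fact two"))  # → True
print(looks_like_bullets("Here is a summary."))      # → False
```

If a chunk fails the check, re-prompt it rather than letting free-form text leak into the merge.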
- Summarize each chunk one by one. In real systems this is where you’d add retries, logging, and rate-limit handling, but the core pattern stays the same.
```python
def summarize_chunk(agent: AssistantAgent, chunk: str) -> str:
    message = (
        "Summarize the following document chunk.\n"
        "Use bullets only.\n\n"
        f"{chunk}"
    )
    response = agent.generate_reply(messages=[{"role": "user", "content": message}])
    # generate_reply may return either a plain string or a message dict.
    return response if isinstance(response, str) else response["content"]

chunk_summaries = []
for i, chunk in enumerate(chunks, start=1):
    summary = summarize_chunk(summarizer, chunk)
    chunk_summaries.append(f"Chunk {i}:\n{summary}")
    print(f"Summarized chunk {i}/{len(chunks)}")
```
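The retries mentioned above can be wrapped generically. This is a sketch, not AutoGen API: it catches all exceptions for brevity, and in practice you would narrow the `except` clause to the rate-limit and timeout errors your client actually raises.

```python
import time

def with_retries(fn, *args, max_attempts=3, base_delay=1.0, **kwargs):
    """Call fn with exponential backoff between failed attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:  # narrow this to real API errors in production
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay:.0f}s")
            time.sleep(delay)

# Usage in the loop above:
# summary = with_retries(summarize_chunk, summarizer, chunk)
```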
- Merge the chunk summaries into a final answer with a second pass. This is the part people skip, but it’s what turns a pile of local summaries into something useful.
```python
merger = AssistantAgent(
    name="merger",
    llm_config=llm_config,
    system_message=(
        "You merge multiple chunk summaries into one coherent final summary. "
        "Remove duplicates, preserve important specifics, and group related points."
    ),
)

merged_input = "\n\n".join(chunk_summaries)
final_summary = merger.generate_reply(
    messages=[
        {
            "role": "user",
            "content": (
                "Merge these chunk summaries into a single executive summary.\n"
                "Keep it structured with headings:\n"
                "- Overview\n- Key Facts\n- Risks / Issues\n- Open Questions\n\n"
                f"{merged_input}"
            ),
        }
    ]
)
print(final_summary if isinstance(final_summary, str) else final_summary["content"])
```
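Exact duplicates are cheap to strip deterministically before the LLM merge, which shortens the merger’s input and leaves it only the harder near-duplicate cases. A sketch, with `dedupe_bullets` as a hypothetical helper:

```python
def dedupe_bullets(summaries: list[str]) -> list[str]:
    """Drop exact-duplicate bullet lines across chunk summaries, keeping order."""
    seen = set()
    merged = []
    for block in summaries:
        for line in block.splitlines():
            key = line.strip().lower()
            if key and key not in seen:
                seen.add(key)
                merged.append(line.strip())
    return merged

print(dedupe_bullets(["- rent due monthly\n- term: 24 months",
                      "- term: 24 months\n- early exit fee applies"]))
```

You would join the deduplicated lines back into `merged_input` before calling the merger; the agent still handles paraphrased duplicates that exact matching misses.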
- If you need better quality on very long inputs, add overlap between adjacent chunks. That helps preserve context across section boundaries where details often get split.
```python
def overlapping_chunks(text: str, max_chars: int = 4000, overlap_chars: int = 500):
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = (current + "\n\n" + para).strip() if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
                # Seed the next chunk with the tail of the previous one
                # so context carries across the boundary.
                current = current[-overlap_chars:] + "\n\n" + para
            else:
                # A single paragraph longer than max_chars: split it into
                # raw slices instead of silently dropping the remainder.
                for start in range(0, len(para), max_chars):
                    chunks.append(para[start:start + max_chars])
                current = ""
    if current:
        chunks.append(current)
    return chunks
```
Testing It
Run the script against a real document that is longer than your model’s context window. A good test file is a policy PDF converted to text or a multi-page internal report with repeated references across sections.
Check three things:
- Every chunk produces a non-empty summary
- The merged output removes duplicate points instead of repeating them
- Important details like dates, obligations, and exceptions survive the merge
If the final summary feels vague, reduce chunk size or increase overlap slightly. If it misses cross-section references, your boundaries are probably too aggressive.
Next Steps
- Add JSON schema-style output from each summarization pass so merging becomes deterministic
- Replace sequential processing with `asyncio` or worker pools for throughput on large corpora
- Add retrieval so you only summarize chunks relevant to a user query instead of every page
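The worker-pool idea from the list above can be sketched with the standard library. A thread pool is enough here because the work is I/O-bound API calls, and `pool.map` preserves input order, so chunk numbering still lines up:

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_all(chunks, summarize, max_workers=4):
    """Run summarize over chunks concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(summarize, chunks))

# Stand-in summarizer for illustration; plug in summarize_chunk in practice.
print(summarize_all(["alpha", "beta"], lambda c: c.upper()))  # → ['ALPHA', 'BETA']
```

Keep `max_workers` modest: most providers rate-limit per key, so more threads can mean more retries, not more throughput.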
By Cyprian Aarons, AI Consultant at Topiax.