LangGraph Tutorial (Python): handling long documents for advanced developers

By Cyprian Aarons · Updated 2026-04-22

This tutorial builds a LangGraph pipeline that can ingest long documents, split them into chunks, process them in parallel, and merge the results into a single structured output. You need this when a single prompt can’t reliably fit the whole document, or when you want deterministic control over chunking, retries, and aggregation.

What You'll Need

  • Python 3.10+
  • langgraph
  • langchain-core
  • langchain-openai
  • OpenAI API key set as OPENAI_API_KEY
  • A long text document to test with
  • Basic familiarity with LangGraph nodes, edges, and state

Install the packages:

pip install langgraph langchain-core langchain-openai

Step-by-Step

  1. Start by defining a graph state that tracks the raw document, chunk list, per-chunk summaries, and the final answer. For long-document workflows, explicit state is what keeps the pipeline debuggable.
from typing import TypedDict, Annotated
import operator

class DocState(TypedDict):
    document: str
    chunks: list[str]
    summaries: Annotated[list[str], operator.add]
    final_summary: str
  2. Next, create a chunking node. This example uses a simple character-based splitter so the code runs as-is without extra dependencies; in production you may swap in token-aware splitting.
def split_document(state: DocState) -> dict:
    text = state["document"]
    chunk_size = 1200
    overlap = 150

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break  # stop here; stepping back by the overlap would re-emit the tail as a duplicate chunk
        start = end - overlap

    return {"chunks": chunks}
  3. Add a per-chunk processing node that summarizes each chunk. This version makes the LLM call through LangChain’s OpenAI wrapper, which plugs into LangGraph with no extra glue and keeps the node a plain, testable function.
import os
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def summarize_chunk(state: dict) -> dict:
    chunk = state["chunk"]
    prompt = (
        "Summarize this document chunk for downstream aggregation. "
        "Keep names, dates, obligations, risks, and decisions.\n\n"
        f"{chunk}"
    )
    response = llm.invoke([HumanMessage(content=prompt)])
    return {"summaries": [response.content]}
  4. Build the LangGraph workflow so it splits first, then maps over chunks in parallel using Send, then reduces everything into one final summary. This is the pattern you want when documents are too large for one-pass prompting.
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

def route_chunks(state: DocState):
    return [Send("summarize_chunk", {"chunk": chunk}) for chunk in state["chunks"]]

def combine_summaries(state: DocState) -> dict:
    joined = "\n\n".join(state["summaries"])
    prompt = (
        "Combine these chunk summaries into one concise document summary. "
        "Preserve important facts and resolve repetition.\n\n"
        f"{joined}"
    )
    response = llm.invoke([HumanMessage(content=prompt)])
    return {"final_summary": response.content}

builder = StateGraph(DocState)
builder.add_node("split_document", split_document)
builder.add_node("summarize_chunk", summarize_chunk)
builder.add_node("combine_summaries", combine_summaries)

builder.add_edge(START, "split_document")
builder.add_conditional_edges("split_document", route_chunks)
builder.add_edge("summarize_chunk", "combine_summaries")
builder.add_edge("combine_summaries", END)

graph = builder.compile()
  5. Run it against a long document string. In real systems this would come from PDF extraction, OCR output, or a document store.
if __name__ == "__main__":
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("Set OPENAI_API_KEY before running this script.")

    long_doc = """
    Contract A includes a 12-month term beginning on March 1st.
    The customer must provide 30 days notice before cancellation.
    Service credits apply only if downtime exceeds 4 hours in a month.
    Data retention is limited to 90 days after termination.
    The vendor may audit usage once per quarter.
    """ * 40

    result = graph.invoke({"document": long_doc, "chunks": [], "summaries": [], "final_summary": ""})
    print(result["final_summary"])

Testing It

Run the script with a real OPENAI_API_KEY and confirm you get one final summary back instead of a truncated answer or token limit error. Then inspect the intermediate state by printing result["chunks"] and result["summaries"] to verify chunking and map-reduce behavior are working as expected.

If you want to test determinism, run it twice with temperature=0 and compare outputs. To validate at larger scale, feed in a contract or policy document with repeated clauses and check that the final summary preserves obligations such as notice periods, exclusions, and retention rules.

Next Steps

  • Replace character-based splitting with token-aware splitting using tiktoken or LangChain text splitters (see the sketch after this list).
  • Add retry logic and structured outputs for each chunk summary.
  • Persist intermediate states in a checkpointer so large document runs can resume after failure.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
