LangGraph Tutorial (Python): chunking large documents for intermediate developers

By Cyprian Aarons · Updated 2026-04-22

This tutorial shows how to build a LangGraph pipeline that splits large documents into manageable chunks, processes them chunk by chunk, and returns structured output you can feed into retrieval, extraction, or summarization flows. (A parallel fan-out variant is sketched under Next Steps.) You need this when a single document is too large for one model call, or when you want predictable chunk-level processing instead of dumping everything into one prompt.

What You'll Need

  • Python 3.10+
  • langgraph
  • langchain-core
  • langchain-text-splitters
  • pydantic
  • An OpenAI API key if you want to swap in LLM-based processing later
  • A large text file or any long string to test with

Install the packages:

pip install langgraph langchain-core langchain-text-splitters pydantic

Step-by-Step

  1. Start by defining the data shape for your graph state and loading the document. For chunking workflows, keep the raw text, the generated chunks, and the final results in separate fields so each node has a clear contract.
from typing import TypedDict, List
from langchain_text_splitters import RecursiveCharacterTextSplitter

class ChunkState(TypedDict):
    text: str
    chunks: List[str]
    results: List[str]

# Repeat a short passage so the sample is long enough to need chunking.
sample_text = """
LangGraph is useful when you need explicit control over multi-step LLM workflows.
Large documents often exceed context windows, so chunking is required.
This tutorial shows a production-friendly way to split and process text.
""" * 20
  2. Build a chunking function with RecursiveCharacterTextSplitter. This splitter is a good default because it respects paragraph and sentence boundaries before falling back to smaller separators.
def split_document(state: ChunkState) -> dict:
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=300,
        chunk_overlap=50,
        separators=["\n\n", "\n", ". ", " ", ""],
    )
    chunks = splitter.split_text(state["text"])
    return {"chunks": chunks, "results": []}
  3. Add a per-chunk processor. In production this is where you would call an LLM, extract entities, classify sections, or generate embeddings; here we keep it deterministic so the example runs as-is.
def process_chunk(chunk: str) -> str:
    # chr(10) is "\n": f-string expressions cannot contain backslashes
    # before Python 3.12, so the newline is spelled via chr().
    words = len(chunk.split())
    chars = len(chunk)
    preview = chunk[:60].replace(chr(10), " ")
    return f"chunk_words={words}; chunk_chars={chars}; preview={preview}"

def process_chunks(state: ChunkState) -> dict:
    results = [process_chunk(chunk) for chunk in state["chunks"]]
    return {"results": results}
  4. Wire the nodes together with LangGraph. The graph is simple here: split first, then process all chunks in one pass. That pattern maps cleanly to more advanced versions where you fan out into parallel workers.
from langgraph.graph import StateGraph, START, END

graph = StateGraph(ChunkState)
graph.add_node("split_document", split_document)
graph.add_node("process_chunks", process_chunks)

graph.add_edge(START, "split_document")
graph.add_edge("split_document", "process_chunks")
graph.add_edge("process_chunks", END)

app = graph.compile()
  5. Run the graph and inspect the output. The result should contain multiple chunks plus one processed record per chunk; if your input grows, the number of chunks should increase without changing your code.
if __name__ == "__main__":
    output = app.invoke({"text": sample_text})
    print(f"chunks={len(output['chunks'])}")
    print(f"results={len(output['results'])}")
    print(output["results"][0])
  6. To make this useful for real document pipelines, replace process_chunk with an LLM call or structured extractor. Keep the same graph shape and only swap the implementation inside the node; one possible LLM-backed version is sketched after the stub below.
# Example replacement point:
# def process_chunk(chunk: str) -> str:
#     response = llm.invoke(f"Extract key facts from:\n\n{chunk}")
#     return response.content
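
For a concrete starting point, here is a minimal sketch of that swap. It assumes langchain-openai is installed (pip install langchain-openai) and OPENAI_API_KEY is set in the environment; the model name and prompt wording are assumptions, not requirements of the pipeline.

from langchain_openai import ChatOpenAI

# Sketch only: swap in whatever model and prompt fit your use case.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def process_chunk(chunk: str) -> str:
    response = llm.invoke(f"Extract key facts from:\n\n{chunk}")
    return response.content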

Testing It

Run the script directly and confirm that chunks is greater than 1 for a long enough input. Then check that results matches the number of chunks exactly; if those counts diverge, your node contracts are wrong.

For a better test, replace sample_text with a real policy PDF converted to text or a long legal memo. You want to see stable chunk sizes and no empty outputs.
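
One way to get that text, as a sketch assuming the pypdf package (pip install pypdf) and a local file named policy.pdf, both placeholders:

from pypdf import PdfReader

# Concatenate per-page text; extract_text() can return None on image-only pages.
reader = PdfReader("policy.pdf")
pdf_text = "\n\n".join(page.extract_text() or "" for page in reader.pages)
output = app.invoke({"text": pdf_text})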

If you later add an LLM node, test with both short and very long inputs. Short inputs should produce one chunk; long inputs should produce multiple chunks without errors or truncation.
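
A minimal sketch of those checks with plain asserts (the input strings are placeholders):

def check_invariants(text: str, expect_multiple: bool) -> None:
    out = app.invoke({"text": text})
    # One processed record per chunk, and no empty outputs.
    assert len(out["results"]) == len(out["chunks"])
    assert all(out["results"])
    if expect_multiple:
        assert len(out["chunks"]) > 1
    else:
        assert len(out["chunks"]) == 1

check_invariants("One short paragraph.", expect_multiple=False)
check_invariants("A long paragraph. " * 500, expect_multiple=True)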

Next Steps

  • Add a parallel fan-out/fan-in pattern so each chunk gets its own node execution (see the sketch after this list)
  • Replace process_chunk with structured extraction using Pydantic models
  • Add metadata tracking for page numbers, section headers, and source offsets
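
As a sketch of the fan-out/fan-in idea using LangGraph's Send API, reusing split_document and process_chunk from above; the node and field names here (fan_out, process_one, WorkerState) are illustrative:

import operator
from typing import Annotated, List, TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

class FanOutState(TypedDict):
    text: str
    chunks: List[str]
    # The reducer merges results written by parallel workers.
    results: Annotated[List[str], operator.add]

class WorkerState(TypedDict):
    chunk: str

def fan_out(state: FanOutState) -> list:
    # One Send per chunk; LangGraph executes the workers concurrently.
    return [Send("process_one", {"chunk": c}) for c in state["chunks"]]

def process_one(state: WorkerState) -> dict:
    return {"results": [process_chunk(state["chunk"])]}

g = StateGraph(FanOutState)
g.add_node("split_document", split_document)
g.add_node("process_one", process_one)
g.add_edge(START, "split_document")
g.add_conditional_edges("split_document", fan_out, ["process_one"])
g.add_edge("process_one", END)
fan_app = g.compile()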

By Cyprian Aarons, AI Consultant at Topiax.