LangGraph Tutorial (Python): handling long documents for intermediate developers
This tutorial builds a LangGraph workflow that ingests long documents, chunks them safely, summarizes each chunk, and merges the results into a final answer. You need this when a single document is too large for one model call, or when you want control over memory, cost, and failure recovery.
What You'll Need
- Python 3.10+
- langgraph
- langchain-core
- langchain-openai
- An OpenAI API key in OPENAI_API_KEY
- A long text document to test with
- Basic familiarity with LangGraph nodes, state, and edges
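If the packages are missing, a typical install looks like this (pin versions to taste; the package names are the ones listed above):

pip install langgraph langchain-core langchain-openai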
Step-by-Step
- Start by defining a state object that can carry the document, chunk list, partial summaries, and the final output. For long-document workflows, keep the state explicit so every node has a clear contract.
from typing import TypedDict, List

from langgraph.graph import StateGraph, START, END


# Shared state: every node reads from and writes back to this contract.
class DocState(TypedDict):
    document: str
    chunks: List[str]
    summaries: List[str]
    final_summary: str
- Next, create a chunking function. This example uses simple character-based splitting so it runs without extra dependencies; that is a reasonable baseline, and you can swap in token-aware chunking later.
def split_document(state: DocState) -> dict:
    text = state["document"]
    chunk_size = 1200  # characters per chunk
    overlap = 150      # characters shared between adjacent chunks
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        # Advance by less than a full chunk so context spans chunk boundaries.
        start += chunk_size - overlap
    return {"chunks": chunks, "summaries": []}
- Now add an LLM-powered node that summarizes each chunk in turn, plus a reducer node that combines all the summaries. This pattern keeps each model call bounded and gives you deterministic control over how intermediate outputs are merged.
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# ChatOpenAI picks up OPENAI_API_KEY from the environment.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


def summarize_chunk(state: DocState) -> dict:
    summaries = []
    for i, chunk in enumerate(state["chunks"], start=1):
        prompt = (
            f"Summarize chunk {i} of {len(state['chunks'])}.\n"
            f"Focus on facts, decisions, names, dates, and risks.\n\n{chunk}"
        )
        response = llm.invoke([HumanMessage(content=prompt)])
        summaries.append(response.content)
    return {"summaries": summaries}
def merge_summaries(state: DocState) -> dict:
    combined = "\n\n".join(
        f"Chunk {i+1}: {summary}" for i, summary in enumerate(state["summaries"])
    )
    prompt = (
        "You are consolidating summaries of a long document.\n"
        "Produce one coherent final summary with key points and open questions.\n\n"
        f"{combined}"
    )
    response = llm.invoke([HumanMessage(content=prompt)])
    return {"final_summary": response.content}
- Wire the graph together with three nodes: split, summarize, and merge. This is where LangGraph gives you structure instead of a single linear chain that becomes hard to debug once documents get large.
graph = StateGraph(DocState)
graph.add_node("split_document", split_document)
graph.add_node("summarize_chunk", summarize_chunk)
graph.add_node("merge_summaries", merge_summaries)
graph.add_edge(START, "split_document")
graph.add_edge("split_document", "summarize_chunk")
graph.add_edge("summarize_chunk", "merge_summaries")
graph.add_edge("merge_summaries", END)
app = graph.compile()
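If you want to eyeball the wiring before spending tokens, the compiled graph can render its own topology. A minimal sketch, assuming a recent langgraph release where draw_mermaid is available on the drawable graph:

# Optional wiring check: prints a Mermaid diagram of the node topology.
print(app.get_graph().draw_mermaid())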
- Finally, run it against a real document string. In practice you would load from PDF or DOCX first, but keeping the input as plain text makes the workflow easy to test and reuse.
if __name__ == "__main__":
    document_text = """
    Acme Insurance Policy Review 2024.
    The policy renewal window opens on March 1st.
    Claims over $50,000 require manual review.
    The fraud detection team flagged three recurring patterns.
    Customer support must respond within 24 hours for escalations.
    """
    result = app.invoke(
        {
            "document": document_text * 20,
            "chunks": [],
            "summaries": [],
            "final_summary": "",
        }
    )
    print(result["final_summary"])
Testing It
Run the script with a valid OPENAI_API_KEY set in your environment. If the graph is wired correctly, you should see one final summary printed after several chunk-level LLM calls.
Test with a short document first so you can confirm the state transitions are correct. Then increase the input size until it crosses your model context limit; the workflow should still complete because each chunk is summarized independently.
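A cheap way to confirm the transitions before involving the model is to run the chunking node on its own, with no API calls. A minimal sketch, assuming the definitions above are in the same file or importable:

# Sanity-check split_document without touching the API.
state = {
    "document": "x" * 3000,
    "chunks": [],
    "summaries": [],
    "final_summary": "",
}
out = split_document(state)
# 3000 chars at chunk_size=1200 with overlap=150 (step 1050) -> 3 chunks.
print(len(out["chunks"]), [len(c) for c in out["chunks"]])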
If you want to inspect intermediate state during debugging, use app.stream(...) instead of app.invoke(...). That lets you verify that chunks and summaries are being populated before the merge step runs.
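A minimal sketch of that, assuming LangGraph's default "updates" stream mode, which yields one event per node containing the partial state update that node returned:

initial_state = {
    "document": document_text * 20,  # reusing document_text from the main block
    "chunks": [],
    "summaries": [],
    "final_summary": "",
}
for event in app.stream(initial_state):
    # Each event maps a node name to the keys that node just wrote.
    for node, update in event.items():
        print(node, "->", list(update.keys()))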
Next Steps
- Replace character splitting with token-aware chunking using tiktoken or a LangChain text splitter (first sketch below).
- Add parallel fan-out so chunk summarization runs concurrently instead of in one loop (second sketch below).
- Persist graph state to a checkpointer so interrupted long-document jobs can resume cleanly (third sketch below).
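First, a token-aware splitter sketch. It assumes tiktoken is installed and that cl100k_base is a reasonable encoding for your model; swap in whatever encoding matches the model you actually call:

from typing import List

import tiktoken


def split_by_tokens(text: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
    # Encode once, slide a window over token ids, decode each window back to text.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(enc.decode(tokens[start:start + chunk_size]))
        start += chunk_size - overlap
    return chunks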
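Second, a fan-out sketch using LangGraph's Send API for map-style parallelism. Assumptions: a recent langgraph release (Send imports from langgraph.types there; older releases exported it from langgraph.constants), and summarize_one here is a stub, not the node from the tutorial:

import operator
from typing import Annotated, List, TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send


class FanState(TypedDict):
    chunks: List[str]
    # The reducer (operator.add) concatenates updates from parallel branches.
    summaries: Annotated[List[str], operator.add]


def summarize_one(state: dict) -> dict:
    # Stub: replace the slice with a real llm.invoke(...) call per chunk.
    return {"summaries": [state["chunk"][:80]]}


def fan_out(state: FanState):
    # One Send per chunk; LangGraph runs the target node once per payload.
    return [Send("summarize_one", {"chunk": c}) for c in state["chunks"]]


fan_graph = StateGraph(FanState)
fan_graph.add_node("summarize_one", summarize_one)
fan_graph.add_conditional_edges(START, fan_out, ["summarize_one"])
fan_graph.add_edge("summarize_one", END)
fan_app = fan_graph.compile()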
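Third, a checkpointer sketch. MemorySaver keeps state in memory only, so this shows the resume mechanics rather than real durability; a database-backed saver would replace it in production:

from langgraph.checkpoint.memory import MemorySaver

# Compile with a checkpointer and address each run by thread_id.
durable_app = graph.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "policy-review-2024"}}
result = durable_app.invoke(initial_state, config)  # initial_state as built earlier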
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.