LangGraph Tutorial (TypeScript): streaming agent responses for advanced developers

By Cyprian Aarons · Updated 2026-04-22

This tutorial shows how to stream intermediate and final agent output from a LangGraph app in TypeScript, so you can build UIs that update token-by-token or event-by-event. You need this when a chat app, support workflow, or analyst tool cannot wait for the full response before showing progress.

What You'll Need

  • Node.js 18+
  • TypeScript 5+
  • @langchain/langgraph
  • @langchain/openai
  • @langchain/core
  • An OpenAI API key in OPENAI_API_KEY
  • A terminal that can run ts-node, tsx, or compiled Node.js output

Install the packages:

npm install @langchain/langgraph @langchain/openai @langchain/core
npm install -D typescript tsx @types/node

Step-by-Step

  1. Start with a minimal graph state and a model node that supports streaming. For advanced apps, keep the state small and explicit so you can route, persist, and render partial results cleanly.
import { ChatOpenAI } from "@langchain/openai";
import { MessagesAnnotation, StateGraph } from "@langchain/langgraph";

// A streaming-capable chat model; temperature 0 keeps test output repeatable.
const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});

// Minimal single-node graph: MessagesAnnotation keeps the state to a message
// list, and the "agent" node appends the model's reply to it.
const graph = new StateGraph(MessagesAnnotation)
  .addNode("agent", async (state) => {
    const response = await model.invoke(state.messages);
    return { messages: [response] };
  })
  .addEdge("__start__", "agent")
  .addEdge("agent", "__end__")
  .compile();
  2. Add a streaming entry point using stream(). The important part is choosing the right stream mode: for agent UIs, "messages" gives you token-level updates, while "updates" gives you node-level state changes.
async function main() {
  const input = {
    messages: [
      { role: "user", content: "Write a short incident summary for a failed payment retry job." },
    ],
  };

  // streamMode "messages" yields [messageChunk, metadata] tuples as tokens arrive.
  const stream = await graph.stream(input, { streamMode: "messages" });

  for await (const [message] of stream) {
    // Content can be a plain string or an array of content blocks; only write text.
    if (typeof message.content === "string" && message.content) {
      process.stdout.write(message.content);
    }
  }

  process.stdout.write("\n");
}

main().catch(console.error);
  3. If you want both live tokens and structured execution events, use multiple stream modes. This is useful when your frontend needs text for the user and metadata for logs, tracing, or progress bars.
async function runWithEvents() {
  const input = {
    messages: [
      { role: "user", content: "Explain why streaming helps in customer support workflows." },
    ],
  };

  // With an array of modes, each event arrives tagged as [mode, payload].
  const stream = await graph.stream(input, {
    streamMode: ["messages", "updates"],
  });

  for await (const event of stream) {
    if (event[0] === "messages") {
      const [, [message]] = event;
      if (message && typeof message.content === "string" && message.content) {
        process.stdout.write(message.content);
      }
    }

    if (event[0] === "updates") {
      // Node-level state changes; log to stderr so tokens stay clean on stdout.
      const [, update] = event;
      console.error("\nNODE UPDATE:", JSON.stringify(update));
    }
  }
}

runWithEvents().catch(console.error);
  4. For production work, separate rendering from graph execution. Your API route should translate LangGraph events into SSE or WebSocket frames instead of printing to stdout.
type StreamFrame =
  | { type: "token"; text: string }
  | { type: "update"; node: string; data: unknown };

// Translate a tagged LangGraph event into a transport-agnostic frame.
function toFrame(event: any): StreamFrame | null {
  if (event[0] === "messages") {
    const [message] = event[1];
    return message && typeof message.content === "string" && message.content
      ? { type: "token", text: message.content }
      : null;
  }

  if (event[0] === "updates") {
    // Updates are keyed by node name, e.g. { agent: { messages: [...] } }.
    const update = event[1];
    return { type: "update", node: Object.keys(update)[0], data: update };
  }

  return null;
}
  5. Use a real server handler to expose the stream. This pattern works well behind an internal API gateway because it keeps LangGraph logic isolated from transport concerns.
import http from "node:http";

// Minimal SSE endpoint: each request runs the graph and forwards frames as they arrive.
http.createServer(async (_req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  const input = {
    messages: [{ role: "user", content: "Summarize the status of the queue processor." }],
  };

  const stream = await graph.stream(input, { streamMode: ["messages", "updates"] });

  for await (const event of stream) {
    const frame = toFrame(event);
    if (frame) res.write(`data: ${JSON.stringify(frame)}\n\n`);
  }

  res.end();
}).listen(3000);

Testing It

Run the script with OPENAI_API_KEY set and confirm that text appears incrementally instead of after the full completion. If you used "updates", verify that node-level events show up in stderr or your SSE consumer.
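
For example, assuming you saved the first streaming example as stream.ts (the file name is just a placeholder):

OPENAI_API_KEY=sk-... npx tsx stream.ts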

For API testing, hit your server with curl -N http://localhost:3000 and watch frames arrive one by one. If nothing streams until the end, check that your model call is inside a streaming-capable path and that your client is not buffering responses.
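
If you would rather check the frames programmatically, here is a minimal consumer sketch using Node 18's built-in fetch. It assumes the step 5 server on port 3000 and the StreamFrame shape above; everything else is illustrative, not a prescribed client.

// Sketch: read raw SSE frames from the local server and print the parsed JSON.
async function consume() {
  const res = await fetch("http://localhost:3000");
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE frames are separated by a blank line; parse each complete one.
    let idx;
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      if (frame.startsWith("data: ")) {
        console.log(JSON.parse(frame.slice(6)));
      }
    }
  }
}

consume().catch(console.error);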

A good sanity check is to swap in a long prompt and confirm latency between chunks. In production, also verify cancellation behavior by closing the client connection mid-stream and ensuring your handler stops work cleanly.
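
Here is one way to sketch that cancellation wiring, reusing graph and toFrame from earlier. It assumes graph.stream accepts an AbortSignal through the standard signal config field, as other LangChain runnables do; the port and prompt are placeholders.

// Sketch: abort the graph run when the SSE client disconnects mid-stream.
http.createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "text/event-stream" });

  const controller = new AbortController();
  req.on("close", () => controller.abort()); // fires when the client goes away

  const input = {
    messages: [{ role: "user", content: "Summarize the status of the queue processor." }],
  };

  try {
    const stream = await graph.stream(input, {
      streamMode: ["messages", "updates"],
      signal: controller.signal,
    });

    for await (const event of stream) {
      if (controller.signal.aborted) break;
      const frame = toFrame(event);
      if (frame) res.write(`data: ${JSON.stringify(frame)}\n\n`);
    }
  } catch (err) {
    // An aborted run surfaces as an error; ignore only the cancellation case.
    if (!controller.signal.aborted) console.error(err);
  }

  res.end();
}).listen(3001); // separate port so it does not clash with the step 5 server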

Next Steps

  • Add tool nodes and stream both tool calls and assistant tokens
  • Persist thread state with checkpoints so interrupted streams can resume (see the sketch after this list)
  • Map LangGraph events into React Server Components or WebSocket clients
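
For the checkpointing item, here is a minimal sketch using the in-memory MemorySaver that ships with @langchain/langgraph. The thread_id value is just an example, and production use would swap in a durable checkpointer.

import { MemorySaver } from "@langchain/langgraph";

// Compile the same graph with a checkpointer so thread state survives across runs.
const persistentGraph = new StateGraph(MessagesAnnotation)
  .addNode("agent", async (state) => {
    const response = await model.invoke(state.messages);
    return { messages: [response] };
  })
  .addEdge("__start__", "agent")
  .addEdge("agent", "__end__")
  .compile({ checkpointer: new MemorySaver() });

// Runs that share a thread_id pick up from the last checkpoint.
async function resumeThread() {
  const stream = await persistentGraph.stream(
    { messages: [{ role: "user", content: "Continue the incident summary." }] },
    { configurable: { thread_id: "incident-42" }, streamMode: "messages" },
  );

  for await (const [message] of stream) {
    if (typeof message.content === "string") process.stdout.write(message.content);
  }
}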

By Cyprian Aarons, AI Consultant at Topiax.