LangChain Tutorial (TypeScript): streaming agent responses for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to stream an agent’s responses token-by-token in TypeScript using LangChain. You need this when a full response is too slow for the UX, or when you want to show progress while the agent is still deciding and calling tools.

What You'll Need

  • Node.js 18+ installed
  • A TypeScript project initialized
  • langchain, @langchain/core, and @langchain/openai installed
  • An OpenAI API key set in OPENAI_API_KEY
  • A terminal that can run ts-node, tsx, or compiled Node output

Install the packages:

npm install langchain @langchain/core @langchain/openai zod
npm install -D typescript tsx @types/node

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Start with a model that supports streaming. In LangChain, streaming is not a separate architecture; it’s just a model configured to emit chunks as they arrive.
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
  streaming: true,
});
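Before wiring in tools, it's worth a quick sanity check that tokens stream at the model level at all. A minimal check using the model defined above (the prompt string is arbitrary):

// Optional sanity check: stream straight from the model, no agent involved.
const modelStream = await model.stream("Say hello in five words.");
for await (const chunk of modelStream) {
  // Each chunk is an AIMessageChunk; content is a plain string for text output.
  if (typeof chunk.content === "string") {
    process.stdout.write(chunk.content);
  }
}
process.stdout.write("\n");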
  2. Define a tool so the agent has something real to do. Streaming becomes useful once the agent can think, decide, and call tools while your UI stays responsive.
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const getPolicyStatus = tool(
  async ({ policyId }) => {
    return `Policy ${policyId} is active and next payment is due on 2026-05-01.`;
  },
  {
    name: "get_policy_status",
    description: "Get the status of an insurance policy by ID.",
    schema: z.object({
      policyId: z.string().describe("The policy identifier"),
    }),
  }
);
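Because tool() returns a runnable, you can exercise the tool on its own before the agent ever sees it. A quick check against schema mistakes, using the getPolicyStatus tool above:

// Optional: invoke the tool directly to confirm the schema and return value.
const toolResult = await getPolicyStatus.invoke({ policyId: "ABC123" });
console.log(toolResult);
// Expected: "Policy ABC123 is active and next payment is due on 2026-05-01."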
  3. Create the agent and executor. Use a tool-calling agent so LangChain can stream the assistant’s final response after any tool calls complete.
import { createToolCallingAgent, AgentExecutor } from "langchain/agents";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful insurance support assistant."],
  ["human", "{input}"],
  new MessagesPlaceholder("agent_scratchpad"),
]);

const agent = await createToolCallingAgent({
  llm: model,
  tools: [getPolicyStatus],
  prompt,
});

const executor = new AgentExecutor({
  agent,
  tools: [getPolicyStatus],
});
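Before adding streaming, confirm the agent runs end-to-end. A plain invoke() call returns the complete answer at once in the output field:

// Non-streaming baseline: one await, one finished answer.
const result = await executor.invoke({
  input: "Check policy ABC123 and summarize its status for me.",
});
console.log(result.output);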
  4. Stream the response tokens to stdout. This is the part most people miss: you don’t call invoke() if you want incremental output. Note that executor.stream() in LangChain JS yields one chunk per agent step rather than per token, so the final answer would still arrive in a single piece; for token-by-token output, consume the streamEvents() stream and filter for chat-model stream events.
const input = {
  input: "Check policy ABC123 and summarize its status for me.",
};

// streamEvents (v2) surfaces fine-grained events, including the chat model's token chunks.
for await (const event of executor.streamEvents(input, { version: "v2" })) {
  if (event.event === "on_chat_model_stream") {
    const content = event.data.chunk?.content;
    if (typeof content === "string") {
      process.stdout.write(content);
    }
  }
}
process.stdout.write("\n");
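The same v2 event stream also carries tool lifecycle events, which is handy when you want the UI to show what the agent is doing between tokens. A sketch that surfaces those alongside the tokens, reusing the input object from above:

for await (const event of executor.streamEvents(input, { version: "v2" })) {
  if (event.event === "on_tool_start") {
    console.log(`\n[calling ${event.name}...]`);
  } else if (event.event === "on_tool_end") {
    console.log(`[${event.name} done]`);
  } else if (event.event === "on_chat_model_stream") {
    const content = event.data.chunk?.content;
    if (typeof content === "string") process.stdout.write(content);
  }
}
process.stdout.write("\n");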
  5. Put it together in one runnable file. This version includes imports, tool setup, prompt setup, and streaming output end-to-end.
import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { z } from "zod";
import { createToolCallingAgent, AgentExecutor } from "langchain/agents";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";

const model = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
  streaming: true,
});

const getPolicyStatus = tool(
  async ({ policyId }) => `Policy ${policyId} is active and next payment is due on 2026-05-01.`,
  {
    name: "get_policy_status",
    description: "Get the status of an insurance policy by ID.",
    schema: z.object({ policyId: z.string() }),
  }
);

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful insurance support assistant."],
  ["human", "{input}"],
  new MessagesPlaceholder("agent_scratchpad"),
]);

const agent = await createToolCallingAgent({
  llm: model,
  tools: [getPolicyStatus],
  prompt,
});

const executor = new AgentExecutor({ agent, tools: [getPolicyStatus] });

for await (const event of executor.streamEvents(
  { input: "Check policy ABC123 and summarize its status for me." },
  { version: "v2" }
)) {
  if (event.event === "on_chat_model_stream") {
    const content = event.data.chunk?.content;
    if (typeof content === "string") {
      process.stdout.write(content);
    }
  }
}
process.stdout.write("\n");

Testing It

Run the file with tsx (for example, npx tsx agent.ts, where agent.ts is whatever you named the file) or compile it with tsc first. Because the example uses top-level await, it must run as an ES module: set "type": "module" in package.json or use the .mts extension. You should see output appear incrementally instead of waiting for one big response at the end.

If nothing streams until completion, check that streaming: true is set on the chat model and that you are consuming executor.streamEvents(...), not awaiting executor.invoke(...). Plain executor.stream() is also not enough here: it yields one chunk per agent step, so the final answer still arrives in a single piece. If you want to verify tool behavior, change the user input to explicitly request a policy lookup and confirm the final answer includes the mock status text.
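If you are unsure which events your run actually emits, log them before filtering. This quick loop (reusing the executor and the input object from step 4) prints each event type and the component that produced it:

// Debug aid: list every event type the run emits.
for await (const event of executor.streamEvents(input, { version: "v2" })) {
  console.log(event.event, event.name);
}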

For a more realistic test, add latency inside the tool with await new Promise((r) => setTimeout(r, 1000)). That makes it obvious whether your UI stays responsive while the agent reasons.
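For example, a delayed variant of the mock tool could look like the following; getPolicyStatusSlow is just an illustrative name:

// Same tool as before, with artificial latency to simulate a slow backend.
const getPolicyStatusSlow = tool(
  async ({ policyId }) => {
    await new Promise((r) => setTimeout(r, 1000));
    return `Policy ${policyId} is active and next payment is due on 2026-05-01.`;
  },
  {
    name: "get_policy_status",
    description: "Get the status of an insurance policy by ID.",
    schema: z.object({ policyId: z.string() }),
  }
);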

Next Steps

  • Add structured event handling with LangChain callbacks so you can stream tokens, tool events, and intermediate steps separately.
  • Replace stdout with Server-Sent Events or WebSockets for browser clients (a minimal SSE sketch follows this list).
  • Swap the mock tool for a real backend integration against your internal policy service or CRM system.
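As a starting point for the SSE route, here is a minimal sketch using Node's built-in http module and the executor from the full example; the /stream path and port 3000 are arbitrary choices, and production code would need error handling and auth:

import { createServer } from "node:http";

// Minimal SSE endpoint: every model token becomes one SSE data frame.
createServer(async (req, res) => {
  if (req.url !== "/stream") {
    res.writeHead(404).end();
    return;
  }
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  for await (const event of executor.streamEvents(
    { input: "Check policy ABC123 and summarize its status for me." },
    { version: "v2" }
  )) {
    if (event.event === "on_chat_model_stream") {
      const content = event.data.chunk?.content;
      if (typeof content === "string" && content.length > 0) {
        // JSON.stringify escapes newlines so each token fits in one data: line.
        res.write(`data: ${JSON.stringify(content)}\n\n`);
      }
    }
  }
  res.end();
}).listen(3000);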

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

