Haystack Tutorial (TypeScript): streaming agent responses for advanced developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to build a TypeScript agent that streams responses token-by-token with Haystack instead of waiting for the full answer. You need this when your UI must feel responsive, when you want to surface partial outputs to users, or when you need to inspect agent behavior as it happens.

What You'll Need

  • Node.js 18+ and npm
  • A TypeScript project with ts-node or a build step via tsc
  • Haystack installed from npm
  • An OpenAI API key in OPENAI_API_KEY
  • A terminal that supports streaming output
  • Basic familiarity with Haystack agents, tools, and chat messages

Install the package first:

npm install haystack-core
npm install -D typescript ts-node @types/node

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Start with a minimal TypeScript file that wires up the model, tool, and agent. The key detail here is that the agent must be configured to use a chat model capable of streaming.
import { OpenAIChatGenerator } from "haystack-core";
import { Agent } from "haystack-core/agents";
import { Tool } from "haystack-core/tools";

const calculatorTool = new Tool({
  name: "calculator",
  description: "Evaluate simple math expressions",
  parameters: {
    type: "object",
    properties: {
      expression: { type: "string" },
    },
    required: ["expression"],
  },
  function: async ({ expression }: { expression: string }) => {
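    // Demo only: eval executes arbitrary strings as code. Swap in a real
    // expression parser before exposing this tool to user input.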
    return String(eval(expression));
  },
});

const generator = new OpenAIChatGenerator({
  model: "gpt-4o-mini",
});

const agent = new Agent({
  llm: generator,
  tools: [calculatorTool],
});
  2. Add a streaming consumer so you can print partial tokens as they arrive. In production, this is where you'd push events into SSE, WebSockets, or your frontend state store; a minimal SSE sketch follows the code below.
async function main() {
  const stream = await agent.runStreaming({
    messages: [
      {
        role: "user",
        content: "What is (12 * 8) + 14? Show your reasoning briefly.",
      },
    ],
  });

  for await (const event of stream) {
    if (event.type === "token") {
      process.stdout.write(event.token);
    }

    if (event.type === "tool_call") {
      console.log(`\n[tool] ${event.toolName}(${JSON.stringify(event.arguments)})`);
    }
  }

  process.stdout.write("\n");
}
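
As a rough illustration of the SSE transport mentioned in this step, here is a minimal sketch using Node's built-in http module. It reuses the agent instance and event shapes from the examples above (runStreaming, token events), so treat those names as assumptions to verify against the version of the library you install.

import http from "node:http";

// Minimal SSE endpoint: each agent event becomes one SSE message.
// Assumes the agent instance from step 1 is in scope.
const server = http.createServer(async (req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  const stream = await agent.runStreaming({
    messages: [{ role: "user", content: "What is (12 * 8) + 14?" }],
  });

  for await (const event of stream) {
    // SSE frames look like "data: <payload>\n\n"; the browser's
    // EventSource API parses them back into discrete events.
    res.write(`data: ${JSON.stringify(event)}\n\n`);
  }

  res.end();
});

server.listen(3000);
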
  3. If you want structured control over what gets streamed, separate final text from tool events. This keeps your logs clean and makes it easier to debug multi-step agent execution.
async function runAndCapture() {
  let finalText = "";

  const stream = await agent.runStreaming({
    messages: [
      {
        role: "user",
        content: "Calculate the result of (19 + 6) * 3 using the calculator tool.",
      },
    ],
  });

  for await (const event of stream) {
    switch (event.type) {
      case "token":
        finalText += event.token;
        process.stdout.write(event.token);
        break;
      case "tool_call":
        console.error(`\nCalling tool: ${event.toolName}`);
        break;
      case "tool_result":
        console.error(`\nTool returned: ${event.result}`);
        break;
    }
  }

  console.log("\n\nFinal text captured:");
  console.log(finalText);
}
  4. Wrap the runner in an executable entry point and handle errors explicitly; this main replaces the one from step 2. Streaming code tends to fail in three places: missing env vars, bad model config, or tool exceptions.
async function main() {
  try {
    if (!process.env.OPENAI_API_KEY) {
      throw new Error("OPENAI_API_KEY is not set");
    }

    await runAndCapture();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    console.error("Agent failed:", message);
    process.exitCode = 1;
  }
}

main();
  5. Run the script directly with ts-node, then move it into your app once it works. If you're building an API service, the same for await...of loop can feed chunked HTTP responses, as in the SSE sketch in step 2.
npx ts-node agent-stream.ts

Testing It

Run the script and confirm you see text appearing incrementally instead of all at once. Then ask a question that forces a tool call so you can verify both token streaming and tool events are emitted.

A good test prompt is one that requires calculation or retrieval, because it exercises more than plain text generation. If nothing streams until the end, check whether your model supports streaming and whether your runtime is buffering stdout.
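
If you want an automated check rather than eyeballing the terminal, one rough approach is to timestamp the first and last token and assert a meaningful gap between them. This sketch uses the same assumed event shape as the examples above:

// Rough incrementality check: if the output arrives in a single burst,
// firstToken and lastToken will be nearly identical.
async function assertIncremental() {
  let firstToken = 0;
  let lastToken = 0;

  const stream = await agent.runStreaming({
    messages: [{ role: "user", content: "Count from 1 to 20." }],
  });

  for await (const event of stream) {
    if (event.type === "token") {
      if (firstToken === 0) firstToken = Date.now();
      lastToken = Date.now();
    }
  }

  if (lastToken - firstToken < 100) {
    throw new Error("Output did not stream incrementally");
  }
}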

For integration testing, wrap agent.runStreaming() in an HTTP endpoint and assert that clients receive chunks in order. In real systems, also verify cancellation behavior so abandoned requests do not keep burning tokens.
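
Cancellation does not require special library support if the stream is a plain async iterator: stop consuming when the client disconnects and break out of the loop. Here is a hedged sketch for a Node HTTP handler; whether the upstream model request is also aborted depends on the library, so verify that separately.

// Stop consuming agent events once the HTTP client disconnects.
// Breaking out of a for await...of loop invokes the iterator's
// return() method, which well-behaved streams use for cleanup.
async function streamToResponse(res: import("node:http").ServerResponse) {
  let clientGone = false;
  res.on("close", () => {
    clientGone = true;
  });

  const stream = await agent.runStreaming({
    messages: [{ role: "user", content: "Summarize the refund policy." }],
  });

  for await (const event of stream) {
    if (clientGone) break; // abandon the run instead of burning tokens
    if (event.type === "token") res.write(event.token);
  }

  res.end();
}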

Next Steps

  • Add Server-Sent Events or WebSocket transport so browser clients can consume the stream directly.
  • Replace the calculator tool with a real internal tool like customer lookup, policy lookup, or claims status.
  • Add tracing around tool_call, tool_result, and token events so you can audit agent behavior in production.
