CrewAI Tutorial (Python): streaming agent responses for advanced developers
This tutorial shows you how to stream agent output from a CrewAI workflow in Python instead of waiting for the full response at the end. You need this when you’re building interactive tools, long-running research agents, or any UI where users should see progress as the model generates it.
What You'll Need
- Python 3.10+
- crewai
- python-dotenv
- An OpenAI API key
- A terminal or notebook where you can run Python code
- Basic familiarity with CrewAI agents, tasks, and crews
Install the packages:
pip install crewai python-dotenv
Set your API key in your shell:
export OPENAI_API_KEY="your-key-here"
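A missing or misloaded key is the most common cause of "nothing streams", so it is worth verifying up front. This small helper (check_api_key is an illustrative name, not part of CrewAI) confirms the variable is visible to Python without printing the secret:

```python
import os

def check_api_key(name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the environment variable is set, printing a masked confirmation."""
    key = os.environ.get(name)
    if key:
        print(f"{name} loaded ({key[:3]}***)")  # never print the full key
        return True
    print(f"{name} is not set - check your shell or .env file")
    return False

check_api_key()
```

Run this once before debugging anything inside CrewAI itself.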
Step-by-Step
- Start by defining a minimal CrewAI setup with one agent and one task. For streaming, the important part is that you keep the task simple and let the execution layer emit incremental output.
from crewai import Agent, Task, Crew, Process

researcher = Agent(
    role="Research Analyst",
    goal="Summarize technical topics clearly",
    backstory="You write concise technical explanations for developers.",
    verbose=True,
)

task = Task(
    description="Explain what streaming responses are in one short paragraph.",
    expected_output="A concise explanation of streaming responses.",
    agent=researcher,
)
- Use a custom callback to capture tokens as they arrive. In practice, this is how you wire streaming into logs, a WebSocket, or a terminal UI.
from typing import Any

class StreamPrinter:
    def __init__(self):
        self.buffer = []

    def __call__(self, token: Any) -> None:
        text = str(token)
        self.buffer.append(text)
        print(text, end="", flush=True)
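Before wiring the callback into CrewAI, it is worth exercising it on its own. This self-contained sketch repeats the class and drives it with fake tokens, so you can confirm incremental terminal output without spending any API calls:

```python
from typing import Any

class StreamPrinter:
    """Collects streamed tokens and echoes them as they arrive."""
    def __init__(self):
        self.buffer = []

    def __call__(self, token: Any) -> None:
        text = str(token)
        self.buffer.append(text)
        print(text, end="", flush=True)

# Drive the callback with fake tokens - no model call needed.
printer = StreamPrinter()
for tok in ["Streaming ", "sends ", "tokens ", "as they arrive."]:
    printer(tok)
print()

assert "".join(printer.buffer) == "Streaming sends tokens as they arrive."
```

The buffer gives you the complete text at the end, which is exactly the split you want later: stream for display, buffer for persistence.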
- Attach the callback to your agent’s LLM configuration and run the crew. This example uses OpenAI-compatible streaming through CrewAI’s LLM configuration; exactly how token-level callbacks are surfaced varies between CrewAI versions, so confirm the wiring against your version’s documentation.
import os

from crewai import LLM
from dotenv import load_dotenv

load_dotenv()

stream_printer = StreamPrinter()

# Agents expect an LLM object (or a model name string), not a raw dict.
# stream=True and callback support depend on your installed CrewAI version.
researcher.llm = LLM(
    model="gpt-4o-mini",
    api_key=os.environ["OPENAI_API_KEY"],
    stream=True,
    callbacks=[stream_printer],
)
crew = Crew(
    agents=[researcher],
    tasks=[task],
    process=Process.sequential,
)

result = crew.kickoff()
print("\n\nFinal result:")
print(result)
- If you want cleaner production behavior, separate streaming from final persistence. The pattern is: stream tokens to the user interface, then store the final result once the task completes.
from pathlib import Path
output_path = Path("crew_output.txt")
final_text = str(result)
output_path.write_text(final_text, encoding="utf-8")
print(f"\nSaved final output to {output_path.resolve()}")
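The same pattern generalizes to a small helper. This sketch (stream_and_persist is a hypothetical name, not a CrewAI API) echoes chunks as they arrive, then writes the joined result atomically so readers never observe a half-written file:

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable

def stream_and_persist(chunks: Iterable[str], path: Path) -> str:
    """Echo chunks as they arrive, then atomically write the joined result."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        print(chunk, end="", flush=True)
    final = "".join(parts)
    # Write to a temp file first, then rename: readers never see partial output.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(final)
    os.replace(tmp, path)
    return final

final = stream_and_persist(["Streamed ", "then ", "saved."], Path("crew_output.txt"))
print(f"\nSaved {len(final)} characters")
```

The atomic rename matters if another process (a web server, a cron job) reads the output file while the crew is still running.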
- For multi-task crews, keep only user-facing tasks streamed and leave internal tasks quiet. That gives you readable progress without flooding logs when one agent hands off to another.
planner = Agent(
    role="Planner",
    goal="Create structured plans",
    backstory="You produce clear step-by-step execution plans.",
)

writer = Agent(
    role="Writer",
    goal="Write concise technical answers",
    backstory="You turn plans into readable developer guidance.",
)

plan_task = Task(
    description="Create a 3-point plan for explaining streaming in CrewAI.",
    expected_output="A short plan.",
    agent=planner,
)

write_task = Task(
    description="Write the final explanation using the plan.",
    expected_output="A polished answer.",
    agent=writer,
)

crew = Crew(
    agents=[planner, writer],
    tasks=[plan_task, write_task],
    process=Process.sequential,
)
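One pragmatic way to silence internal tasks is a callback that filters on a role label you attach yourself when emitting chunks. RoleFilteredPrinter below is an illustrative sketch of that idea, not a CrewAI API; everything is still buffered for debugging, but only chunks from user-facing roles reach the screen:

```python
from typing import Any

class RoleFilteredPrinter:
    """Print only chunks attributed to user-facing roles; buffer everything."""
    def __init__(self, visible_roles: set) -> None:
        self.visible_roles = visible_roles
        self.buffer = []   # full (role, text) trace for debugging
        self.shown = []    # only what the user actually saw

    def __call__(self, role: str, token: Any) -> None:
        text = str(token)
        self.buffer.append((role, text))
        if role in self.visible_roles:
            self.shown.append(text)
            print(text, end="", flush=True)

printer = RoleFilteredPrinter(visible_roles={"Writer"})
printer("Planner", "step 1: define terms... ")  # logged, not shown
printer("Writer", "Streaming lets users see progress early.")
```

Keeping the full trace means you lose nothing: flip a role into visible_roles when you need to debug a handoff.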
Testing It
Run the script from your terminal and watch for partial output before the final result prints. If streaming is wired correctly, you should see text appear incrementally instead of all at once after completion.
If nothing streams, check three things first: your API key is loaded, stream is set to True, and your callback is attached to the model config used by the agent. Also confirm your installed CrewAI version supports the configuration style shown here.
For a real test, replace the simple prompt with a longer task like summarizing a document or generating a multi-step plan. Streaming becomes obvious when generation takes more than a couple of seconds.
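If you are not sure what correctly wired streaming should look like, simulate it first. This fake generator needs no API key and reproduces the incremental display behavior in your terminal:

```python
import time
from typing import Iterator

def fake_stream(text: str, chunk_size: int = 8, delay: float = 0.05) -> Iterator[str]:
    """Yield text in small chunks with a delay, like a real model stream."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]
        time.sleep(delay)

for chunk in fake_stream("If this text appears in pieces, incremental printing works."):
    print(chunk, end="", flush=True)
print()
```

If this prints all at once, the problem is your terminal or output buffering, not CrewAI; if it prints in pieces but the crew does not, the problem is the model config.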
Next Steps
- Add a FastAPI endpoint that forwards streamed tokens over Server-Sent Events or WebSockets
- Wrap streamed output in structured events so your frontend can render “thinking”, “tool use”, and “final answer” separately
- Explore CrewAI task delegation patterns so only user-visible steps stream while background steps stay internal
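The FastAPI idea above ultimately comes down to Server-Sent Events framing, which is simple to produce by hand: each event is one or more data: lines followed by a blank line. A minimal formatter (sse_frame is an illustrative name) looks like this:

```python
from typing import Optional

def sse_frame(token: str, event: Optional[str] = None) -> str:
    """Format one token as a Server-Sent Events frame."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Multi-line payloads need one "data:" line per line of text.
    for line in token.splitlines() or [""]:
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"

print(sse_frame("Hello"), end="")
print(sse_frame("final answer", event="done"), end="")
```

In a FastAPI handler you would yield these frames from a StreamingResponse with media type text/event-stream; the named event field is how a frontend can distinguish token chunks from a final-answer signal.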
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.