LangChain Tutorial (Python): streaming agent responses for beginners
This tutorial shows you how to build a LangChain agent in Python that streams its responses token by token instead of waiting for the full answer. You need this when you want a chat UI, CLI, or backend service to feel responsive while the model is still generating output.
What You'll Need
- Python 3.10+
- An OpenAI API key
- These packages: langchain, langchain-openai, langchain-community, python-dotenv
- A terminal and a virtual environment
- Basic familiarity with LangChain agents and tools
Step-by-Step
- Set up your environment and install the packages. Keep your API key in a .env file so you do not hardcode credentials into your codebase.
python -m venv .venv
source .venv/bin/activate
pip install langchain langchain-openai langchain-community python-dotenv
Create a .env file:
OPENAI_API_KEY=your_openai_api_key_here
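It can help to fail fast at startup when the key is missing, rather than getting an opaque authentication error mid-request. A minimal sketch, assuming you call it once after load_dotenv() (the require_api_key helper below is hypothetical, not part of LangChain or python-dotenv):

```python
import os

def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the named key from the environment, or raise a clear error.

    Hypothetical helper: call it once at startup, after load_dotenv(),
    so a missing .env fails immediately instead of deep inside a request.
    """
    key = os.getenv(name)
    if not key:
        raise RuntimeError(f"{name} is not set; check your .env file.")
    return key
```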
- Create a simple tool the agent can call. Streaming works best when you can see both tool usage and final answer generation in real time.

from langchain_core.tools import tool

@tool
def get_account_status(account_id: str) -> str:
    """Return a mock account status for a given account ID."""
    mock_db = {
        "1001": "Account 1001 is active with no overdue balance.",
        "1002": "Account 1002 is pending verification.",
    }
    return mock_db.get(account_id, f"Account {account_id} not found.")
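Because @tool wraps a plain function, the lookup logic itself can be sanity-checked with no LangChain (or API key) involved. A sketch of that core logic, duplicated here as an undecorated function purely for a quick unit test:

```python
def account_status(account_id: str) -> str:
    """Same lookup as get_account_status, minus the @tool decorator."""
    mock_db = {
        "1001": "Account 1001 is active with no overdue balance.",
        "1002": "Account 1002 is pending verification.",
    }
    # Known IDs return the stored status; unknown IDs fall back to a message.
    return mock_db.get(account_id, f"Account {account_id} not found.")

print(account_status("1001"))  # → Account 1001 is active with no overdue balance.
print(account_status("9999"))  # → Account 9999 not found.
```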
- Build a streaming chat model and an agent prompt. The key detail is streaming=True, which tells the model to emit chunks as they arrive.

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

load_dotenv()

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,
    streaming=True,
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful support agent."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
- Create the agent and executor. This is the part that wires the LLM, prompt, and tool together into something that can reason and act.

from langchain.agents import create_openai_tools_agent, AgentExecutor

tools = [get_account_status]
agent = create_openai_tools_agent(llm=llm, tools=tools, prompt=prompt)

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)
- Stream the response chunk by chunk in your terminal. Use astream_events so you can inspect tokens as they are produced instead of waiting for the full result.

import asyncio

async def main():
    user_input = "Check account 1001 and explain the status briefly."
    async for event in agent_executor.astream_events(
        {"input": user_input},
        version="v2",
    ):
        if event["event"] == "on_chat_model_stream":
            chunk = event["data"]["chunk"]
            if chunk.content:
                print(chunk.content, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
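The event-filtering logic in that loop can be exercised without an API key by feeding it a fake stream shaped like astream_events v2 output. Everything below (FakeChunk, fake_events, collect_tokens) is a test scaffold of my own, not LangChain API, but the filter inside collect_tokens is the same one used above:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class FakeChunk:
    content: str

async def fake_events():
    # Mimic v2 event shapes: tool events interleaved with token chunks.
    yield {"event": "on_tool_start", "data": {}}
    for token in ["Account ", "1001 ", "is ", "active."]:
        yield {"event": "on_chat_model_stream", "data": {"chunk": FakeChunk(token)}}
    yield {"event": "on_tool_end", "data": {}}

async def collect_tokens(events) -> str:
    # Same filter as main(): keep only non-empty chat-model token chunks.
    parts = []
    async for event in events:
        if event["event"] == "on_chat_model_stream":
            chunk = event["data"]["chunk"]
            if chunk.content:
                parts.append(chunk.content)
    return "".join(parts)

print(asyncio.run(collect_tokens(fake_events())))  # → Account 1001 is active.
```

This kind of scaffold is handy for testing a streaming frontend or WebSocket handler before paying for real model calls.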
- Add a non-streaming fallback if you want to debug tool behavior first. This helps when you want to compare the final answer with streamed events during development.
result = agent_executor.invoke({"input": "Check account 1002 and summarize it."})
print("\n\nFinal result:")
print(result["output"])
Testing It
Run the script from your terminal and watch for partial text appearing before the full answer completes. If everything is wired correctly, you should see the agent think through the request, call get_account_status, then continue streaming its final response.
If you only see one complete block at the end, check that streaming=True is set on ChatOpenAI and that you are using astream_events, not invoke. Also confirm your OpenAI key is loaded correctly from .env.
For a better test, try prompts that force tool use, like asking about account 1001 or 1002. That makes it obvious whether the agent is actually calling tools before generating its final answer.
Next Steps
- Add more tools, such as policy lookup or claims status functions.
- Stream events into a FastAPI endpoint or WebSocket for a real frontend.
- Learn how to handle intermediate tool-call events separately from token streaming.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.