Haystack Tutorial (Python): persisting agent state for advanced developers
This tutorial shows how to persist Haystack agent state in Python so a multi-step agent can resume after process restarts, deploy cleanly across workers, and keep conversation context without rebuilding it every request. You need this when your agent is doing real work: customer support flows, claims triage, underwriting assistants, or any workflow where losing state means losing the task.
What You'll Need
- Python 3.10+
- `haystack-ai` installed
- An OpenAI API key
- A shell with environment variables set
- Basic familiarity with Haystack pipelines and tools
Install the package:
```shell
pip install haystack-ai
```

Set your API key:

```shell
export OPENAI_API_KEY="your-key-here"
```
Step-by-Step
- Start by creating a small tool and an agent that can use it. The important part here is that the agent has memory of prior turns, not just a single prompt/response exchange.

```python
from haystack import Pipeline, component
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret


@component
class PolicyLookup:
    @component.output_types(result=str)
    def run(self, query: str):
        return {"result": f"Policy lookup result for: {query}"}


# Haystack expects a Secret, not a raw string, for api_key.
llm = OpenAIChatGenerator(
    model="gpt-4o-mini",
    api_key=Secret.from_env_var("OPENAI_API_KEY"),
)
lookup = PolicyLookup()

pipeline = Pipeline()
pipeline.add_component("llm", llm)
pipeline.add_component("lookup", lookup)
```
- Add a persistent state object. In production you usually store this in Redis, Postgres, or another durable backend; for the tutorial we will serialize to disk so you can see the full flow end to end.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class AgentState:
    messages: list[dict] = field(default_factory=list)

    def save(self, path: str) -> None:
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path: str) -> "AgentState":
        p = Path(path)
        if not p.exists():
            return cls()
        data = json.loads(p.read_text())
        return cls(messages=data.get("messages", []))


state_path = "agent_state.json"
state = AgentState.load(state_path)
```
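A crash mid-write can leave `agent_state.json` truncated. If you stay on disk rather than moving to a database, one common hardening step is an atomic save: write to a temporary file in the same directory, then rename it over the old state. This is a minimal sketch of my own, not part of the Haystack API; the `atomic_save` name is illustrative.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_save(path: str, payload: dict) -> None:
    """Write JSON to a temp file, then atomically replace the target.

    os.replace is atomic on POSIX (and on Windows for same-volume paths),
    so readers never observe a half-written file.
    """
    target = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f, indent=2)
        os.replace(tmp_name, target)
    except BaseException:
        os.unlink(tmp_name)
        raise
```

You could call this from `AgentState.save` in place of `Path.write_text` without changing anything else in the tutorial.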
- Convert the persisted state into Haystack `ChatMessage` objects before calling the model. This keeps your state format simple on disk while still feeding the LLM the exact message history it expects.

```python
def to_chat_messages(messages: list[dict]) -> list[ChatMessage]:
    return [ChatMessage.from_dict(m) for m in messages]


def append_user_message(state: AgentState, text: str) -> None:
    state.messages.append(ChatMessage.from_user(text).to_dict())


def append_assistant_message(state: AgentState, text: str) -> None:
    state.messages.append(ChatMessage.from_assistant(text).to_dict())


append_user_message(state, "What is my policy status?")
chat_messages = to_chat_messages(state.messages)

result = llm.run(messages=chat_messages)
assistant_text = result["replies"][0].text

append_assistant_message(state, assistant_text)
state.save(state_path)
print(assistant_text)
```
- Now simulate a restart by loading the file again and continuing the conversation. This is the real test of persistence: if your process dies or a worker gets replaced, the next request should pick up from exactly where it left off.

```python
state_after_restart = AgentState.load(state_path)

append_user_message(state_after_restart, "Also check whether my deductible applies.")
chat_messages = to_chat_messages(state_after_restart.messages)

result = llm.run(messages=chat_messages)
assistant_text = result["replies"][0].text

append_assistant_message(state_after_restart, assistant_text)
state_after_restart.save(state_path)
print(assistant_text)
```
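The load → append → generate → append → save loop above repeats verbatim each turn. One way to consolidate it (a sketch of my own, not a Haystack helper) is a `run_turn` function that takes the generation step as a callable, so the persistence logic stays testable without an API key. The flat role/content dict shape here is simplified relative to the full `ChatMessage.to_dict` output.

```python
from typing import Callable


def run_turn(
    messages: list[dict],
    user_text: str,
    generate: Callable[[list[dict]], str],
) -> str:
    """Append the user turn, call the model, append and return the reply.

    `generate` maps the full message history to the assistant's reply text;
    in the tutorial it would wrap llm.run(...) plus the ChatMessage conversion.
    """
    messages.append({"role": "user", "content": user_text})
    reply = generate(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply
```

In a service, each request would load state, call `run_turn` with a closure over `llm`, and save afterward, keeping the persistence boundary in one place.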
- If you want tool usage in the same persisted flow, keep tool results in state too. That way your agent can explain what it checked and you have an auditable trail of intermediate steps.

```python
append_user_message(state_after_restart, "Look up claim CLM-1042 and summarize it.")

tool_result = lookup.run(query="claim CLM-1042")["result"]
state_after_restart.messages.append(
    ChatMessage.from_assistant(f"Tool result: {tool_result}").to_dict()
)

chat_messages = to_chat_messages(state_after_restart.messages)
result = llm.run(messages=chat_messages)
assistant_text = result["replies"][0].text

append_assistant_message(state_after_restart, assistant_text)
state_after_restart.save(state_path)
print(tool_result)
print(assistant_text)
```
Testing It
Run the script once and confirm it creates `agent_state.json`. Then run it again without deleting the file and check that the second response includes context from earlier turns instead of starting fresh.
A good sanity check is to inspect the JSON file directly. You should see alternating user and assistant messages stored in order.
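That inspection can be automated. Here is a small check of my own (assuming each stored message dict carries a top-level `"role"` key; adjust the accessor if your serialized `ChatMessage` shape differs):

```python
def roles_alternate(messages: list[dict]) -> bool:
    """Check that user and assistant turns strictly alternate in order."""
    roles = [m["role"] for m in messages if m.get("role") in ("user", "assistant")]
    return all(a != b for a, b in zip(roles, roles[1:]))
```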
If you are wiring this into an API service, restart the server between requests and verify that the conversation still continues correctly. If it does not, your persistence layer is either not being written or not being reloaded at request start.
Next Steps
- Replace the JSON file with Redis or Postgres for multi-instance deployments.
- Add message trimming so old context does not grow without bound.
- Store tool calls and retrieval results separately for auditability and replay.
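The trimming idea above can be sketched in a few lines: keep any leading system message plus the most recent N messages. This is a minimal version of my own that counts messages rather than tokens; a production version would budget by token count using your model's tokenizer.

```python
def trim_messages(messages: list[dict], max_messages: int = 20) -> list[dict]:
    """Keep the first system message (if any) plus the most recent turns."""
    system = [m for m in messages if m.get("role") == "system"][:1]
    rest = [m for m in messages if m.get("role") != "system"]
    return system + rest[-max_messages:]
```

Calling this on `state.messages` before `state.save(...)` keeps the file bounded without losing the system instructions.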
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit