LlamaIndex Tutorial (Python): persisting agent state for beginners
This tutorial shows you how to persist a LlamaIndex agent's state to disk in Python and load it back later. You need this when you want an agent to remember its chat history and session state across process restarts instead of starting from zero every time.
What You'll Need
- Python 3.10+
- `llama-index` installed
- An OpenAI API key set as `OPENAI_API_KEY`
- A writable local directory for persistence
- Basic familiarity with `ReActAgent` and LlamaIndex tools
Install the package:
```shell
pip install llama-index
```
Step-by-Step
- Start by creating a simple agent with one tool. The key point here is that the same tool and LLM will later be reused in a workflow that supports persistence.
```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)
llm = OpenAI(model="gpt-4o-mini")

agent = ReActAgent.from_tools(
    tools=[tool],
    llm=llm,
    verbose=True,
)
```
- Add a chat store and a memory object backed by it, and create a local directory for persistence. This is what lets the agent recover its conversation history after the process exits.
```python
import os

from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.storage.chat_store import SimpleChatStore

persist_dir = "./agent_state"
os.makedirs(persist_dir, exist_ok=True)

# The chat store holds the raw messages; the memory buffer reads and
# writes them under a fixed session key.
chat_store = SimpleChatStore()
memory = ChatMemoryBuffer.from_defaults(
    chat_store=chat_store,
    chat_store_key="demo_session",
    token_limit=4000,
)

# Memory is attached per run (see the next step) rather than at
# construction time.
workflow = AgentWorkflow.from_tools_or_functions(
    [tool],
    llm=llm,
)
```
- Run the agent once, then persist the chat store and a small state file to disk. In practice you save both: the conversation history lives in the chat store, while the state file records which session key to reload later.
```python
import asyncio
import json

async def first_run() -> None:
    # AgentWorkflow.run is async; pass the memory so the exchange is
    # recorded in the chat store under "demo_session".
    response = await workflow.run(
        user_msg="What is 6 times 7?", memory=memory
    )
    print(response)

asyncio.run(first_run())

# SimpleChatStore persists to a single JSON file.
chat_store.persist(
    persist_path=os.path.join(persist_dir, "chat_store.json")
)

# Record which session key to reload after a restart.
state_path = os.path.join(persist_dir, "workflow_state.json")
with open(state_path, "w", encoding="utf-8") as f:
    json.dump({"chat_store_key": "demo_session"}, f, indent=2)
```
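If you later need to rebuild the memory with exactly the same settings, the metadata file can carry more than the session key. A minimal, dependency-free sketch; `SessionState`, `save_state`, and `load_state` are hypothetical helpers, not part of LlamaIndex:

```python
import json
import os
from dataclasses import asdict, dataclass

@dataclass
class SessionState:
    """Everything needed to rebuild the memory object after a restart."""

    chat_store_key: str
    token_limit: int = 4000

def save_state(state: SessionState, persist_dir: str) -> str:
    """Write the session metadata next to the chat store."""
    path = os.path.join(persist_dir, "workflow_state.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(state), f, indent=2)
    return path

def load_state(persist_dir: str) -> SessionState:
    """Read the metadata back into a typed object."""
    path = os.path.join(persist_dir, "workflow_state.json")
    with open(path, "r", encoding="utf-8") as f:
        return SessionState(**json.load(f))
```

This keeps the reload step honest: the memory is reconstructed from recorded values rather than constants duplicated across two scripts.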
- Simulate a restart by creating fresh objects and loading the persisted data back. This is the part beginners usually miss: if you rebuild everything from scratch without loading storage, the agent will not remember anything.
```python
import json

from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.storage.chat_store import SimpleChatStore

with open("./agent_state/workflow_state.json", "r", encoding="utf-8") as f:
    saved_state = json.load(f)

# Reload the chat store from the JSON file persisted in the previous step.
loaded_chat_store = SimpleChatStore.from_persist_path(
    "./agent_state/chat_store.json"
)
loaded_memory = ChatMemoryBuffer.from_defaults(
    chat_store=loaded_chat_store,
    chat_store_key=saved_state["chat_store_key"],
    token_limit=4000,
)

# After a real restart, rebuild `tool` and `llm` exactly as in step 1
# before this line runs.
reloaded_workflow = AgentWorkflow.from_tools_or_functions(
    [tool],
    llm=llm,
)
```
- Ask a follow-up question and confirm the agent still has context from before. If persistence worked, it should continue from the same session instead of behaving like a brand-new assistant.
```python
import asyncio

async def resume() -> None:
    follow_up = await reloaded_workflow.run(
        user_msg="What was my previous question?", memory=loaded_memory
    )
    print(follow_up)

    another_answer = await reloaded_workflow.run(
        user_msg="Now multiply 8 by 9.", memory=loaded_memory
    )
    print(another_answer)

asyncio.run(resume())
```
Testing It
Run the script once and confirm it answers 42 for 6 times 7. Then stop the process, rerun it, and ask a follow-up like “What was my previous question?”; if persistence is working, the agent should reference the earlier interaction stored in demo_session.
Check that ./agent_state/ contains persisted files after the first run. If you delete that directory and rerun, the agent should lose its memory, which is a good sanity check that your persistence path is actually being used.
If you want stronger verification, print out stored messages from loaded_chat_store before running the second query. That lets you confirm that conversation history is being restored before any new inference happens.
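One way to inspect the history without importing LlamaIndex at all is to read the persisted JSON directly. A sketch that assumes the file layout `SimpleChatStore` used at the time of writing, a top-level `"store"` mapping of session keys to message lists; inspect your own file if the shape differs:

```python
import json

def list_session_messages(persist_path: str, session_key: str) -> list[dict]:
    """Return the raw message dicts stored under one session key."""
    with open(persist_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # "store" maps each chat_store_key to its list of serialized messages.
    return data.get("store", {}).get(session_key, [])
```

Call `list_session_messages("./agent_state/chat_store.json", "demo_session")` after the first run; a non-empty list confirms history was written before any new inference happens.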
Next Steps
- Move from `SimpleChatStore` to a production-grade store like Redis or Postgres for multi-process deployments.
- Add per-user session keys so each customer gets isolated memory.
- Persist more than chat history by looking into LlamaIndex workflow/state serialization patterns for longer-running agents.
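The per-user session idea above can be as small as a naming convention for chat-store keys. A hypothetical sketch; `session_key_for_user` and the `app` namespace are assumptions, not LlamaIndex APIs:

```python
def session_key_for_user(user_id: str, app: str = "support-bot") -> str:
    """Build a stable, namespaced chat-store key for one user.

    One key per (app, user) pair keeps each customer's memory
    isolated even when sessions share a single chat store.
    """
    safe = "".join(
        c if c.isalnum() or c in "-_" else "-" for c in user_id.lower()
    )
    return f"{app}:{safe}"
```

Pass the result as `chat_store_key` when building the memory, so two users talking to the same deployment never see each other's history.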
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist plus starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.