LlamaIndex Tutorial (TypeScript): persisting agent state for beginners
This tutorial shows you how to persist a LlamaIndex agent’s state in TypeScript so it can survive process restarts, redeploys, and background job retries. You need this when your agent holds useful context in memory, but you still want the conversation and tool history to be available after the server restarts.
What You'll Need
- Node.js 18+
- A TypeScript project with `"type": "module"` or a compatible ESM setup
- `llamaindex` installed
- An OpenAI API key set as `OPENAI_API_KEY`
- A writable local directory for persisted state
- Basic familiarity with creating a LlamaIndex chat/agent workflow in TypeScript
Install the package if you have not already:
```shell
npm install llamaindex
```
Step-by-Step
1. Create a small project setup and load your API key.

The persistence APIs in LlamaIndex work with a storage context, so start by making sure your environment is ready and your app can read the key from `process.env`.
```typescript
import "dotenv/config";
import { OpenAI, Settings } from "llamaindex";

// Fail fast if the key is missing, before configuring the LLM.
if (!process.env.OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set");
}

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});
```
2. Build an agent that can hold state across turns.
For beginners, the simplest useful pattern is a chat engine or agent backed by a memory store. Here we use a persistent storage directory so the session data can be reloaded later.
```typescript
import {
  Document,
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";

const docs = [
  new Document({ text: "Topiax handles claims intake for insurance workflows." }),
  new Document({ text: "Persisted agent state helps resume conversations after restart." }),
];

const storageContext = await storageContextFromDefaults({
  persistDir: "./storage",
});

const index = await VectorStoreIndex.fromDocuments(docs, {
  storageContext,
});
```
3. Persist the index and any attached state to disk.

This is the critical step. Once persisted, the index files live under `./storage`, which lets you reconstruct the same state later without rebuilding everything from scratch.
```typescript
await index.storageContext.persist();
console.log("State persisted to ./storage");
```
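If one process hosts several agents or conversations, a single shared `./storage` directory means every session writes over the same files. One way around this, sketched below with only Node's standard library, is to derive a per-session persist directory. The `persistDirFor` helper and the `customer-42` session id are illustrative names, not part of LlamaIndex.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: give each agent session its own persist directory
// so two conversations never overwrite each other's state.
function persistDirFor(baseDir: string, sessionId: string): string {
  // Guard against path traversal in user-supplied session ids.
  const safeId = sessionId.replace(/[^a-zA-Z0-9_-]/g, "_");
  const dir = path.join(baseDir, safeId);
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Pass the result as persistDir when creating the storage context.
const dir = persistDirFor("./storage", "customer-42");
console.log(dir);
```

Sanitizing the session id matters here because anything that reaches `path.join` ends up on your filesystem.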
4. Reload the persisted state in a fresh process.
This simulates what happens after a server restart. Instead of re-ingesting documents, you load from disk and continue using the same stored data.
```typescript
import { VectorStoreIndex, storageContextFromDefaults } from "llamaindex";

const reloadedStorage = await storageContextFromDefaults({
  persistDir: "./storage",
});

const reloadedIndex = await VectorStoreIndex.init({
  storageContext: reloadedStorage,
});

console.log("Reloaded index from disk");
```
5. Query the reloaded state to confirm it works end-to-end.
If persistence is correct, the query should return an answer based on documents you indexed earlier, even though this code runs after reload.
```typescript
const queryEngine = reloadedIndex.asQueryEngine();
const response = await queryEngine.query({
  query: "What does Topiax handle?",
});
console.log(String(response));
```
Here is the full example in one file so you can run it directly:
```typescript
import "dotenv/config";
import {
  Document,
  OpenAI,
  Settings,
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";

// Fail fast if the key is missing, before configuring the LLM.
if (!process.env.OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set");
}

Settings.llm = new OpenAI({
  model: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
});

async function main() {
  const docs = [
    new Document({ text: "Topiax handles claims intake for insurance workflows." }),
    new Document({ text: "Persisted agent state helps resume conversations after restart." }),
  ];

  // Build the index against a persistent storage directory.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage",
  });
  const index = await VectorStoreIndex.fromDocuments(docs, {
    storageContext,
  });
  await index.storageContext.persist();

  // Reload from disk, simulating a fresh process after restart.
  const reloadedStorage = await storageContextFromDefaults({
    persistDir: "./storage",
  });
  const reloadedIndex = await VectorStoreIndex.init({
    storageContext: reloadedStorage,
  });

  // Query the reloaded index to confirm persistence worked.
  const queryEngine = reloadedIndex.asQueryEngine();
  const response = await queryEngine.query({
    query: "What does Topiax handle?",
  });
  console.log(String(response));
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
Testing It
Run the script once and confirm that ./storage gets created with persisted files inside it. Then stop the process and run it again; if reload works, it should answer from the previously stored data without rebuilding everything manually.
For a stronger test, run only the reload-and-query section in a fresh Node process, without re-ingesting any documents. If you get a relevant answer from the loaded index, persistence is working correctly.
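To script that restart check, it helps to verify that a previous run actually left files behind before attempting a reload. The guard below is a small standard-library sketch; the `hasPersistedState` name and the throwaway demo directory are mine, not LlamaIndex's.

```typescript
import * as fs from "node:fs";

// Hypothetical guard: did a previous run leave persisted state behind?
// If not, the app should build and persist instead of trying to reload.
function hasPersistedState(persistDir: string): boolean {
  return fs.existsSync(persistDir) && fs.readdirSync(persistDir).length > 0;
}

// Demo against a throwaway directory instead of the real ./storage.
const demoDir = fs.mkdtempSync("llamaindex-demo-");
console.log(hasPersistedState(demoDir)); // false: directory is empty
fs.writeFileSync(`${demoDir}/doc_store.json`, "{}");
console.log(hasPersistedState(demoDir)); // true: a store file exists
fs.rmSync(demoDir, { recursive: true, force: true });
```

In the real script you would branch on this guard: reload when it returns true, build from documents when it returns false.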
If you are wiring this into an actual agent loop, verify that conversation-specific fields also survive between requests by storing them in the same persistence layer or external session store. The key check is simple: restart the app and make sure prior context still influences behavior.
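One way to give conversation-specific fields the same durability, sketched below with Node's standard library alone: keep one JSON file per session. `FileSessionStore` and the `SessionState` shape are illustrative names I chose for this sketch, not a LlamaIndex API; adapt the fields to whatever your agent loop actually tracks.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical shape for per-conversation state that should survive restarts.
interface SessionState {
  sessionId: string;
  messages: { role: "user" | "assistant"; content: string }[];
}

// Minimal file-backed session store: one JSON file per session id.
class FileSessionStore {
  constructor(private dir: string) {
    fs.mkdirSync(dir, { recursive: true });
  }
  private fileFor(id: string): string {
    return path.join(this.dir, `${id}.json`);
  }
  save(state: SessionState): void {
    fs.writeFileSync(this.fileFor(state.sessionId), JSON.stringify(state));
  }
  load(id: string): SessionState | undefined {
    const file = this.fileFor(id);
    if (!fs.existsSync(file)) return undefined;
    return JSON.parse(fs.readFileSync(file, "utf8")) as SessionState;
  }
}

// Round-trip check: what we save is what a "restarted" process loads.
const store = new FileSessionStore(
  fs.mkdtempSync(path.join(os.tmpdir(), "sessions-")),
);
store.save({ sessionId: "abc", messages: [{ role: "user", content: "hi" }] });
const restored = store.load("abc");
console.log(restored?.messages[0].content); // "hi"
```

In production you would point the store at the same durable directory as `persistDir`, or swap the file backend for Redis or a database, but the restart check stays the same: load by session id and confirm prior messages come back.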
Next Steps
- Add a proper chat memory layer on top of persistence for multi-turn conversations.
- Move `persistDir` to durable infrastructure such as EBS volumes, Kubernetes PVCs, or network-attached storage.
- Combine this with tool calling so your agent can resume both knowledge retrieval and action history after a restart.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.