LlamaIndex Tutorial (TypeScript): testing agents locally for intermediate developers
This tutorial shows you how to run a LlamaIndex agent locally in TypeScript, wire it to a simple tool, and test it without deploying anything. You need this when you want fast iteration on agent behavior, deterministic debugging, and a safe place to validate tool calls before exposing the agent to real users.
What You'll Need
- Node.js 18+ installed
- A TypeScript project with `ts-node` or `tsx`
- LlamaIndex packages: `llamaindex`
- An OpenAI API key for the LLM: `OPENAI_API_KEY`
- Optional but useful:
  - `dotenv` for local env loading
  - `vitest` or `jest` if you want automated tests later
Install the minimum set:
```bash
npm install llamaindex
npm install -D typescript tsx @types/node
```
If you want environment variable loading:
```bash
npm install dotenv
```
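If you use dotenv, a minimal `.env` file in the project root is all you need. The key value below is a placeholder:

```bash
# .env (never commit this file)
OPENAI_API_KEY=sk-your-key-here
```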
Step-by-Step
1. Create a small TypeScript entry point (for example, `src/index.ts`) and load your API key from the environment. Keep this file boring; the goal is to make agent behavior easy to inspect locally.
import "dotenv/config";
import { OpenAI } from "llamaindex";
if (!process.env.OPENAI_API_KEY) {
throw new Error("OPENAI_API_KEY is required");
}
const llm = new OpenAI({
model: "gpt-4o-mini",
});
console.log("LLM ready:", llm.model);
2. Add a local tool that the agent can call. For testing, use something deterministic like a calculator so you can verify whether the agent is actually invoking tools instead of hallucinating answers.
```ts
import { FunctionTool } from "llamaindex";

const calculateTool = FunctionTool.from(
  async ({ expression }: { expression: string }) => {
    // Allowlist digits, arithmetic operators, parentheses, and whitespace so
    // the expression can't smuggle in arbitrary JavaScript.
    const allowed = /^[0-9+\-*/().\s]+$/;
    if (!allowed.test(expression)) {
      return "Invalid expression";
    }
    const result = Function(`"use strict"; return (${expression});`)();
    return String(result);
  },
  {
    name: "calculate",
    description: "Evaluate a basic arithmetic expression.",
    parameters: {
      type: "object",
      properties: {
        expression: {
          type: "string",
          description: "Arithmetic expression like (12 + 8) / 4",
        },
      },
      required: ["expression"],
    },
  }
);

console.log("Tool ready:", calculateTool.metadata.name);
```
3. Build an agent around the model and tool. The important part here is that you keep the prompt narrow so you can see whether tool routing works under controlled conditions.
```ts
import { OpenAIAgent } from "llamaindex";

const agent = new OpenAIAgent({
  tools: [calculateTool],
  llm,
  systemPrompt:
    "You are a precise assistant. Use the calculate tool for arithmetic questions.",
});

async function main() {
  const response = await agent.chat({
    message: "What is (18 + 6) / 3?",
  });
  console.log(response.response);
}

main().catch(console.error);
```
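If you want to confirm the tool actually ran, you can log whatever tool outputs the response carries. The `sources` field below is an assumption about your installed llamaindex version, so fall back to dumping the whole response object if it is undefined:

```ts
// A hedged sketch: `sources` holding tool outputs is an assumption about your
// llamaindex version; if it comes back undefined, inspect the full response.
async function inspectToolCalls() {
  const response = await agent.chat({ message: "What is (18 + 6) / 3?" });
  console.log("answer:", response.response);
  const sources = (response as { sources?: unknown }).sources;
  console.log("tool outputs:", JSON.stringify(sources ?? "not available", null, 2));
}

inspectToolCalls().catch(console.error);
```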
4. Run it locally and inspect the output. If the model is behaving correctly, it should call the tool for arithmetic and return the computed answer instead of free-styling the math.
```bash
npx tsx src/index.ts
```
5. Add a repeatable local test harness so you can verify behavior during development. This is where most teams stop relying on manual chat prompts and start catching regressions early.
```ts
import assert from "node:assert/strict";
import { FunctionTool, OpenAI, OpenAIAgent } from "llamaindex";

async function runTest() {
  const llm = new OpenAI({ model: "gpt-4o-mini" });

  const calculateTool = FunctionTool.from(
    async ({ expression }: { expression: string }) => {
      // Reuse the same allowlist as the main tool; never eval raw model output.
      const allowed = /^[0-9+\-*/().\s]+$/;
      if (!allowed.test(expression)) {
        return "Invalid expression";
      }
      return String(Function(`"use strict"; return (${expression});`)());
    },
    {
      name: "calculate",
      description: "Evaluate basic arithmetic.",
      parameters: {
        type: "object",
        properties: {
          expression: { type: "string" },
        },
        required: ["expression"],
      },
    }
  );

  const agent = new OpenAIAgent({
    tools: [calculateTool],
    llm,
    systemPrompt: "Use tools for math.",
  });

  const response = await agent.chat({ message: "What is 10 * (2 + 3)?" });
  assert.match(response.response, /50/);
}

runTest()
  .then(() => console.log("test passed"))
  .catch((err) => {
    console.error("test failed:", err);
    process.exit(1);
  });
```
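LLM output is nondeterministic, so a single green run can hide flakiness. As a sketch, you can swap the single `runTest()` invocation above for a small repeat loop to get a cheap stability signal:

```ts
// A minimal sketch: run the same assertion several times to surface flaky
// tool routing. Swap this in for the single runTest() call above.
async function runRepeated(times: number) {
  for (let i = 0; i < times; i++) {
    await runTest();
    console.log(`pass ${i + 1}/${times} ok`);
  }
}

runRepeated(3).catch((err) => {
  console.error("test failed:", err);
  process.exit(1);
});
```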
Testing It
Run the script twice with slightly different prompts, such as "What is (18 + 6) / 3?" and "Calculate 7 * 8". You should see stable answers, and if you log intermediate steps later, you should be able to confirm that the tool was called rather than the answer guessed.
If you get authentication errors, check `OPENAI_API_KEY` first. If the output looks wrong, tighten the system prompt and make sure your tool schema matches what the model expects.
For local regression testing, wrap the agent call in an assertion-based script like the one above and run it in CI. That gives you a cheap smoke test every time someone changes prompts, tools, or model settings.
Next Steps
- Add structured tracing so you can inspect each tool call and model response during debugging.
- Replace the calculator with a real internal tool, like policy lookup or claim status retrieval.
- Move these assertions into Vitest so local checks become part of your normal test suite (see the sketch below).
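A minimal Vitest version might look like this. The `../src/agent` import is hypothetical: it assumes you export the configured `agent` from the setup you built above.

```ts
// test/agent.test.ts: a minimal Vitest sketch. The "../src/agent" module path
// is hypothetical; it assumes you export the configured `agent` object.
import { describe, expect, it } from "vitest";
import { agent } from "../src/agent";

describe("calculator agent", () => {
  it("routes arithmetic through the calculate tool", async () => {
    const response = await agent.chat({ message: "What is 10 * (2 + 3)?" });
    expect(response.response).toMatch(/50/);
  }, 30_000); // generous timeout: real LLM calls are slow
});
```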
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.