LlamaIndex Tutorial (Python): handling async tools for beginners
This tutorial shows how to build a LlamaIndex agent in Python that can call async tools correctly without blocking your event loop. You need this when your tools hit databases, HTTP APIs, or internal services and you want the agent to stay responsive under concurrent load.
What You'll Need
- Python 3.10 or newer
- `llama-index`
- `openai`
- An OpenAI API key set as `OPENAI_API_KEY`
- Optional but useful:
  - `python-dotenv` for loading environment variables from a `.env` file
  - `httpx` if you plan to wrap real async HTTP calls later
Install the core packages:
```shell
pip install llama-index openai python-dotenv
```
Set your API key:
```shell
export OPENAI_API_KEY="your-key-here"
```
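Before wiring up the agent, it helps to fail fast when the key is missing, since a missing key otherwise surfaces as a confusing error deep inside the first LLM call. A minimal stdlib-only sketch (the `require_api_key` helper is illustrative, not part of LlamaIndex):

```python
import os

def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the named API key, or fail fast with a clear message."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the agent.")
    return key
```

Call `require_api_key()` once at startup so a misconfigured environment fails immediately instead of mid-conversation.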
Step-by-Step
- Start with a minimal LlamaIndex setup and define an async tool.
The important part here is that the tool function uses `async def`, because LlamaIndex can await it instead of treating it like a normal blocking function.
```python
import asyncio

from llama_index.core.tools import FunctionTool

async def get_order_status(order_id: str) -> str:
    # Simulate a slow downstream call (database, HTTP API, etc.).
    await asyncio.sleep(1)
    return f"Order {order_id} is currently in transit."

status_tool = FunctionTool.from_defaults(
    fn=get_order_status,
    name="get_order_status",
    description="Get the shipping status for an order by order_id.",
)
```
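Under the hood, a framework can tell an `async def` tool apart from a regular function because Python marks it as a coroutine function. A stdlib-only sketch of that distinction, which also smoke-tests the tool logic directly without any agent (shortened sleep for speed):

```python
import asyncio
import inspect

async def get_order_status(order_id: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a real I/O call
    return f"Order {order_id} is currently in transit."

def get_order_status_sync(order_id: str) -> str:
    return f"Order {order_id} is currently in transit."

# A framework can use a check like this to decide whether to await a tool
# or run it as a plain blocking call.
is_async = inspect.iscoroutinefunction(get_order_status)
is_sync = inspect.iscoroutinefunction(get_order_status_sync)

# Smoke-test the async tool directly, without an agent in the loop.
result = asyncio.run(get_order_status("12345"))
```

Testing the bare coroutine like this is a quick way to separate "my tool is broken" from "my agent wiring is broken".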
- Build an agent that knows how to use the tool.
For beginners, `FunctionCallingAgentWorker` is the cleanest path because it supports tool calling directly and works well with async functions.
```python
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

agent = FunctionCallingAgentWorker.from_tools(
    [status_tool],
    llm=llm,
    verbose=True,
).as_agent()
```
- Call the agent from an async entry point.
If your tool is async, your app should also be async at the top level. That keeps everything consistent and avoids event-loop issues later.
```python
import asyncio

async def main() -> None:
    # achat is the async counterpart of chat; awaiting it keeps the
    # event loop free while tools run.
    response = await agent.achat(
        "Check order 12345 and tell me its status."
    )
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
```
- Add a second async tool so you can see multi-tool behavior.
In real projects, this is where you’d call another service like policy lookup, claims search, or customer profile retrieval.
```python
import asyncio

from llama_index.core.tools import FunctionTool

async def get_customer_tier(customer_id: str) -> str:
    # Simulate another slow service call.
    await asyncio.sleep(1)
    return f"Customer {customer_id} is on the Gold tier."

tier_tool = FunctionTool.from_defaults(
    fn=get_customer_tier,
    name="get_customer_tier",
    description="Get a customer's subscription tier by customer_id.",
)
```
- Wire both tools into the same agent and ask a combined question.
This is the real test: the agent should decide which tool to use, await each one correctly, and return a single answer.
```python
import asyncio

from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

agent = FunctionCallingAgentWorker.from_tools(
    [status_tool, tier_tool],
    llm=llm,
    verbose=True,
).as_agent()

async def main() -> None:
    response = await agent.achat(
        "For customer 77, check their tier and also check order 12345."
    )
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
```
Testing It
Run the script from your terminal and watch the verbose output. You should see the agent choose a tool, call it, wait for the async result, and then produce a final response.
If you see `RuntimeError: This event loop is already running`, you are probably calling `asyncio.run()` inside an environment that already manages its own loop, such as Jupyter. In that case, call `await main()` directly from the notebook cell instead of wrapping it again.
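You can reproduce that failure mode with nothing but the standard library: calling `asyncio.run()` while a loop is already running raises `RuntimeError`, and awaiting the coroutine directly is the fix. A small sketch:

```python
import asyncio

async def inner() -> str:
    return "done"

async def outer() -> str:
    coro = inner()
    try:
        # This mirrors calling asyncio.run() inside a notebook cell
        # whose event loop is already running.
        asyncio.run(coro)
        return "no conflict"
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        # The fix: await the coroutine directly instead of wrapping it again.
        return await inner()

result = asyncio.run(outer())
```

Here `outer()` runs inside a live loop, so the nested `asyncio.run()` always hits the `RuntimeError` branch, just as it would in Jupyter.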
If the model responds without using your tools, check the tool descriptions and make them specific enough for routing. Also confirm that OPENAI_API_KEY is set and that your installed LlamaIndex version matches the imports shown here.
A good sanity check is to make each tool sleep for one second. If both tools are used in sequence, you’ll notice the delay; if they’re wired incorrectly, you’ll usually get either no delay or an exception about awaiting a non-coroutine.
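That timing intuition is easy to verify without LlamaIndex at all. In this stdlib sketch (with shortened sleeps), sequential awaits add up, while `asyncio.gather` overlaps them; an agent calling tools one after another should show the summed delay:

```python
import asyncio
import time

async def fake_tool(name: str) -> str:
    await asyncio.sleep(0.2)  # stand-in for a 1-second tool call
    return f"{name} done"

async def sequential() -> float:
    start = time.perf_counter()
    await fake_tool("status")
    await fake_tool("tier")
    return time.perf_counter() - start

async def concurrent() -> float:
    start = time.perf_counter()
    await asyncio.gather(fake_tool("status"), fake_tool("tier"))
    return time.perf_counter() - start

seq_elapsed = asyncio.run(sequential())    # roughly the sum of the delays
conc_elapsed = asyncio.run(concurrent())   # roughly one delay
```

If your "async" tool shows no delay at all, it probably returned a coroutine that was never awaited, which is the exception case mentioned above.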
Next Steps
- Wrap real async HTTP calls with `httpx.AsyncClient` instead of `asyncio.sleep`
- Add structured outputs so tools return JSON instead of plain strings
- Learn LlamaIndex workflows if you want multi-step orchestration beyond a single agent
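The structured-output idea above can be sketched with plain dataclasses and JSON, no extra libraries needed (the `OrderStatus` schema is a made-up example, not a LlamaIndex type):

```python
import asyncio
import json
from dataclasses import asdict, dataclass

@dataclass
class OrderStatus:
    order_id: str
    status: str
    eta_days: int

async def get_order_status(order_id: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a real lookup
    # JSON gives the LLM (and your own downstream code) a predictable
    # shape to parse, instead of a free-form sentence.
    record = OrderStatus(order_id=order_id, status="in_transit", eta_days=3)
    return json.dumps(asdict(record))

payload = json.loads(asyncio.run(get_order_status("12345")))
```

The tool still returns a string, so it drops into `FunctionTool.from_defaults` unchanged; only the string's contents become machine-readable.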
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.