How to Fix 'async event loop error during development' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-21

If you’re seeing RuntimeError: This event loop is already running or RuntimeError: asyncio.run() cannot be called from a running event loop while developing with LlamaIndex, you’re hitting an async/sync boundary problem. It usually shows up in notebooks, FastAPI apps, Streamlit, or any environment that already owns the event loop.

In LlamaIndex, this often happens when you call a synchronous helper that internally tries to start its own loop, while your app is already inside async def code or an interactive runtime.

The Most Common Cause

The #1 cause is calling asyncio.run() or a sync LlamaIndex wrapper from inside an already-running event loop.

This happens a lot with code like index.as_query_engine().query(...), VectorStoreIndex.from_documents(...) inside notebook cells, or custom wrappers around RetrieverQueryEngine and QueryEngineTool that mix sync and async calls.

Broken vs fixed pattern

Broken:

  • Starts a new loop inside an existing async context
  • Common in Jupyter, FastAPI, Streamlit

Fixed:

  • Uses await all the way through
  • Works in async-native apps
# BROKEN
import asyncio
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)

async def main():
    query_engine = index.as_query_engine()
    # This can trigger:
    # RuntimeError: This event loop is already running
    result = asyncio.run(query_engine.aquery("What is in the documents?"))
    print(result)

asyncio.run(main())

# FIXED
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)

async def main():
    query_engine = index.as_query_engine()
    result = await query_engine.aquery("What is in the documents?")
    print(result)

# In a script:
import asyncio
asyncio.run(main())

If you’re already inside a framework-managed async function, don’t call asyncio.run() at all. Use await directly.
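If you need one helper that can be called from both plain scripts and loop-owning runtimes, a minimal sketch is to check for a running loop first and fail loudly at the boundary (here `fake_aquery` is a hypothetical stand-in for `query_engine.aquery`):

```python
import asyncio

async def fake_aquery(q: str) -> str:
    # hypothetical stand-in for query_engine.aquery
    await asyncio.sleep(0)
    return f"answer to {q}"

def run_async(coro):
    """Run a coroutine from sync code, failing loudly if a loop already runs."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)  # no loop running: safe to start one
    # A loop is already running (Jupyter, FastAPI, ...): asyncio.run() would
    # raise, so make the boundary mistake explicit instead of hiding it.
    raise RuntimeError("Already inside an event loop; use 'await' directly.")

print(run_async(fake_aquery("What is in the documents?")))
```

Failing with a clear message beats silently nesting loops: it tells you exactly which call site needs to become `await`.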

Other Possible Causes

1. Using sync LlamaIndex APIs inside async framework handlers

FastAPI routes, Starlette endpoints, and some LangServe-style handlers run on an active event loop. Calling .query() instead of .aquery() blocks that loop for the whole request and can trigger loop errors when the engine schedules async work internally.

# BROKEN
@app.get("/search")
async def search(q: str):
    response = query_engine.query(q)  # sync call in async handler
    return {"answer": str(response)}

# FIXED
@app.get("/search")
async def search(q: str):
    response = await query_engine.aquery(q)
    return {"answer": str(response)}
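If the engine you're using only exposes a sync .query(), the handler can stay async by offloading the call to a worker thread. A sketch using asyncio.to_thread (Python 3.9+), with a hypothetical blocking_query standing in for the sync engine call:

```python
import asyncio
import time

def blocking_query(q: str) -> str:
    # hypothetical stand-in for a sync-only query_engine.query
    time.sleep(0.1)  # simulated blocking retrieval + LLM call
    return f"result for {q}"

async def search(q: str) -> dict:
    # to_thread runs the sync call in a worker thread,
    # so the event loop stays free to serve other requests
    answer = await asyncio.to_thread(blocking_query, q)
    return {"answer": answer}

print(asyncio.run(search("refund policy")))
```

This is the same pattern FastAPI applies automatically when you declare a route with plain `def` instead of `async def`.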

2. Notebook/Jupyter auto-await conflicts

Jupyter already runs an event loop. If you use asyncio.run() or create your own loop manually, you’ll get:

  • RuntimeError: This event loop is already running
  • sometimes followed by RuntimeWarning: coroutine was never awaited

# BROKEN in Jupyter
import asyncio
result = asyncio.run(query_engine.aquery("Summarize this"))

# FIXED in Jupyter
result = await query_engine.aquery("Summarize this")
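If some library code path you don't control insists on calling asyncio.run() inside a notebook, the third-party nest_asyncio package (commonly used in LlamaIndex notebook examples) patches the running loop to tolerate re-entry. A guarded sketch in case it isn't installed or supported on your Python version:

```python
# nest_asyncio is a third-party package: pip install nest-asyncio
try:
    import nest_asyncio
    nest_asyncio.apply()  # patches the loop so nested asyncio.run() works
    patched = True
except Exception:  # not installed, or unsupported on this Python version
    patched = False  # fall back to plain top-level 'await'
print("nest_asyncio applied:", patched)
```

Treat this as a notebook-only escape hatch; in application code, fixing the sync/async boundary is the durable solution.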

3. Mixing callback handlers that block the loop

Some custom callback handlers and instrumentation hooks do blocking I/O inside async callbacks. That can surface as an event-loop failure when LlamaIndex emits events through CallbackManager.

# BROKEN
from llama_index.core.callbacks import CallbackManager

class MyHandler:  # illustrative; real handlers subclass BaseCallbackHandler
    def on_event_start(self, *args, **kwargs):
        import time
        time.sleep(2)  # blocking call stalls the event loop

callback_manager = CallbackManager([MyHandler()])

Use non-blocking code or move heavy work out of the callback path.
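One common way to move the heavy work out of the callback path is to have the callback enqueue the event and let a background thread do the slow part. A stdlib-only sketch (MyHandler and the queue are illustrative, not LlamaIndex APIs):

```python
import queue
import threading
import time

events: queue.Queue = queue.Queue()
processed = []

def sink():
    # background consumer: the slow I/O happens here, off the event loop
    while True:
        item = events.get()
        if item is None:  # sentinel: drain and stop
            break
        time.sleep(0.01)  # simulated slow logging/export
        processed.append(item)

worker = threading.Thread(target=sink, daemon=True)
worker.start()

class MyHandler:
    def on_event_start(self, *args, **kwargs):
        events.put("event_start")  # O(1) enqueue, never blocks the loop

MyHandler().on_event_start()
events.put(None)  # shut the consumer down for this demo
worker.join()
print(processed)
```

The callback now returns immediately; the event loop never waits on the slow sink.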

4. Creating nested loops in worker threads or task runners

If you’re using Celery, background jobs, or custom thread pools and calling async LlamaIndex methods incorrectly, you can end up with:

  • RuntimeError: There is no current event loop in thread
  • RuntimeError: Event loop is closed
# BROKEN
def worker():
    asyncio.run(query_engine.aquery("Check policy terms"))

Prefer one top-level async entrypoint per process. If you must bridge sync to async in worker code, manage the boundary once and keep it consistent.
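One way to "manage the boundary once" is a single long-lived loop in a dedicated thread that every sync worker submits to. A sketch with a hypothetical fake_aquery standing in for query_engine.aquery:

```python
import asyncio
import threading

# One loop per process, owned by one thread; workers never create loops.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def fake_aquery(q: str) -> str:
    await asyncio.sleep(0)  # hypothetical stand-in for query_engine.aquery
    return f"answer: {q}"

def worker(q: str) -> str:
    # sync worker code crosses the async boundary exactly once,
    # via the shared loop, instead of calling asyncio.run() per task
    future = asyncio.run_coroutine_threadsafe(fake_aquery(q), loop)
    return future.result(timeout=30)

result = worker("Check policy terms")
print(result)
loop.call_soon_threadsafe(loop.stop)  # clean shutdown for this demo
```

Because the loop outlives individual tasks, you also avoid "Event loop is closed" errors from clients that cache connections across calls.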

How to Debug It

  1. Check whether your code path is already async

    • If you’re inside async def, do not use asyncio.run().
    • Replace .query() with .aquery() and .chat() with .achat() where available.
  2. Look at the exact traceback

    • RuntimeError: This event loop is already running
    • RuntimeError: asyncio.run() cannot be called from a running event loop
    • RuntimeWarning: coroutine was never awaited

    These point to different boundary mistakes.

  3. Find the first sync call into LlamaIndex

    • Search for .query(, .chat(, .retrieve(, or any wrapper that hides an internal coroutine.
    • Check whether your app framework expects async handlers.
  4. Test outside the framework

    • Run the same LlamaIndex call in a plain Python script.
    • If it works there but fails in FastAPI/Jupyter/Streamlit, the issue is your runtime integration, not LlamaIndex itself.
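A minimal standalone probe for step 4 can look like this, with probe() as a placeholder for your real engine call:

```python
import asyncio

async def probe() -> str:
    # replace the body with your real call,
    # e.g. await query_engine.aquery("...")
    await asyncio.sleep(0)
    return "ok"

# Succeeds in a plain script; the same line raises inside Jupyter/FastAPI,
# which is exactly the signal you are looking for.
outcome = asyncio.run(probe())
print(outcome)
```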

Prevention

  • Use async end-to-end when your app framework is async.
    • Pair aquery, achat, and other async methods with await.
  • Don’t wrap coroutine calls with asyncio.run() unless you control the whole process entrypoint.
  • Keep notebook code notebook-native.
    • In Jupyter use await ..., not manual loop management.
  • Add one integration test for each runtime you support:
    • plain script
    • notebook-style execution
    • web handler / API route

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
