How to Fix 'invalid API key in production' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-21

When you see invalid API key in production from a LlamaIndex app, it usually means the process is not using the credential you think it is. In practice, this shows up when your local .env works, but production uses a different environment, a stale secret, or a provider client that was initialized before the right key was loaded.

The exact failure often bubbles up through OpenAI-style clients used by LlamaIndex, for example:

  • openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Invalid API key provided'}}
  • ValueError: Invalid API key
  • A startup crash: the provider client fails during initialization, so LlamaIndex never gets as far as running a query

The Most Common Cause

The #1 cause is simple: you set the API key too late, or you rely on os.environ being updated after LlamaIndex has already created its LLM/embedding client.

This happens a lot in FastAPI, Celery, Docker, and serverless deployments where module-level code runs on import.

Wrong pattern vs right pattern

| Broken | Fixed |
| --- | --- |
| Loads OpenAI() before env vars are available | Loads env vars first, then creates the client |
| Uses a module-level singleton initialized at import time | Initializes inside startup/bootstrap code |
| Works locally because .env is loaded by your shell | Fails in production because the process environment is different |
# broken.py
import os
from dotenv import load_dotenv
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o")  # reads env too early

load_dotenv()  # too late
os.environ["OPENAI_API_KEY"] = "sk-prod-..."  # also too late for already-created client

# later...
response = llm.complete("Hello")

# fixed.py
import os
from dotenv import load_dotenv
from llama_index.llms.openai import OpenAI

load_dotenv()  # load first

api_key = os.environ["OPENAI_API_KEY"]
llm = OpenAI(
    model="gpt-4o",
    api_key=api_key,
)

response = llm.complete("Hello")
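In FastAPI, Celery, or serverless code where imports run before configuration is loaded, another option is to defer client creation until first use. This is a minimal sketch of that pattern using a cached factory; the returned dict is a stand-in for the real OpenAI client so the example stays self-contained:

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_llm():
    # The env var is read at first call, not at import time, so load_dotenv()
    # or your deployment's secret injection has already run by then.
    api_key = os.environ["OPENAI_API_KEY"]  # KeyError here means fail fast

    # Real version (using the import path from this article) would be:
    #   from llama_index.llms.openai import OpenAI
    #   return OpenAI(model="gpt-4o", api_key=api_key)
    return {"api_key": api_key}  # stand-in so the sketch runs anywhere
```

Because the factory is cached, every caller shares one client, but nothing is constructed until the environment is ready.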

If you’re using the newer LlamaIndex package layout, the same rule applies to Settings:

import os

from dotenv import load_dotenv
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

load_dotenv()

Settings.llm = OpenAI(
    model="gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
)

Other Possible Causes

1) Wrong environment variable name

LlamaIndex integrations often expect provider-specific names. For OpenAI-style clients, OPENAI_API_KEY is common. For Azure OpenAI, you need different values like endpoint and deployment name.

# broken
LLM_API_KEY=sk-prod-123

# fixed
OPENAI_API_KEY=sk-prod-123
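For Azure OpenAI specifically, a key alone is not enough. This is a sketch of the extra settings; the exact variable names depend on which client reads them, so treat these as illustrative placeholders:

```shell
# .env for Azure OpenAI (values are placeholders)
AZURE_OPENAI_API_KEY=<your-azure-key>
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
OPENAI_API_VERSION=<api-version>
```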

If you pass the key explicitly, make sure it’s going to the right constructor argument:

# broken
OpenAI(model="gpt-4o", token=os.environ["OPENAI_API_KEY"])

# fixed
OpenAI(model="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])

2) Production secret is truncated or malformed

A copied key with whitespace, quotes, or line breaks will fail authentication.

# broken .env
OPENAI_API_KEY="sk-prod-abc123\n"

# fixed .env
OPENAI_API_KEY=sk-prod-abc123

In Python, defensively strip stray whitespace if your secret source is messy:

api_key = os.getenv("OPENAI_API_KEY", "").strip()
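If you want to go one step further, a small loader (the function name here is illustrative) can strip wrapper characters and fail fast on an empty value:

```python
import os


def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Read a key from the environment, stripping stray whitespace and quotes."""
    raw = os.getenv(name, "")
    key = raw.strip().strip('"').strip("'")
    if not key:
        raise RuntimeError(f"{name} is missing or empty")
    return key
```

Failing here, at startup, beats a 401 from the provider minutes later.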

3) You are mixing providers and keys

A valid Anthropic key won’t work with an OpenAI client. Same for Azure vs standard OpenAI.

# broken: Anthropic key passed into OpenAI client
from llama_index.llms.openai import OpenAI

llm = OpenAI(api_key=os.environ["ANTHROPIC_API_KEY"])

Use the matching integration:

from llama_index.llms.anthropic import Anthropic

llm = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
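Because both vendors issue keys that start with sk-, a quick heuristic can catch this mix-up at startup. The prefix below reflects current Anthropic key formats and could change, so treat this as a best-effort sanity check, not a guarantee:

```python
def looks_like_anthropic_key(key: str) -> bool:
    # Anthropic keys currently begin with "sk-ant-"; OpenAI keys do not.
    # Prefixes are not a contract, so use this only as a startup warning.
    return key.startswith("sk-ant-")


def warn_on_mismatch(key: str, provider: str) -> None:
    # Illustrative guard: refuse to hand an Anthropic-style key to an OpenAI client.
    if provider == "openai" and looks_like_anthropic_key(key):
        raise ValueError("This looks like an Anthropic key, not an OpenAI key")
```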

4) Your container or runtime doesn’t have the secret at all

This is common in Docker/Kubernetes/Cloud Run/Lambda. The app boots with no key and falls back to an empty string or local default.

# broken k8s snippet: secret never injected into container env
env:
  - name: OPENAI_API_KEY
    value: ""
# fixed k8s snippet
env:
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: openai-secrets
        key: OPENAI_API_KEY

How to Debug It

  1. Print what the process actually sees. Check whether the runtime has a non-empty value before LlamaIndex initializes anything.

    import os
    
    print(repr(os.getenv("OPENAI_API_KEY")))
    

    If this prints None, '', or a short truncated string, you found your issue.

  2. Confirm where initialization happens. Search for module-level code like:

    llm = OpenAI(model="gpt-4o")
    

    If that runs at import time, move it into startup code or dependency injection.

  3. Check which class is failing. Look at the stack trace. If you see:

    • llama_index.llms.openai.OpenAI
    • openai.AuthenticationError
    • ValueError: Invalid API key

    then this is a provider auth problem, not an indexing bug.

  4. Test outside LlamaIndex. Verify the raw provider call with the same env var.

    import os
    from openai import OpenAI as RawOpenAI
    
    client = RawOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    print(client.models.list())
    

    If this fails too, stop debugging LlamaIndex and fix credentials or deployment secrets first.
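When printing the value in step 1, avoid logging the full secret. A small masking helper (illustrative, not part of any library) shows enough to diagnose truncation without leaking the key:

```python
from typing import Optional


def mask_key(key: Optional[str]) -> str:
    """Show a safe prefix and the length so truncation is visible in logs."""
    if not key:
        return "<missing>"
    if len(key) <= 8:
        return f"<suspiciously short: {len(key)} chars>"
    return f"{key[:6]}... ({len(key)} chars)"
```

A truncated or empty secret then stands out in logs without the key itself ever being written.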

Prevention

  • Initialize provider clients after loading config, never at module import time.
  • Pass secrets explicitly in production instead of relying on implicit environment behavior.
  • Add a startup check that fails fast if required keys are missing:
    assert os.getenv("OPENAI_API_KEY"), "Missing OPENAI_API_KEY"
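One caveat: a bare assert disappears when Python runs with the -O flag. A startup check that survives optimization might look like this sketch (the variable list is illustrative; extend it for your providers):

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY",)  # extend for your providers


def check_required_env() -> None:
    # RuntimeError survives `python -O`, unlike a bare assert.
    missing = [name for name in REQUIRED_VARS if not os.getenv(name, "").strip()]
    if missing:
        raise RuntimeError(f"Missing required env vars: {', '.join(missing)}")
```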
    

If you want fewer midnight incidents, treat API keys like any other production dependency: validate them early, inject them explicitly, and don’t assume your laptop environment matches production.



By Cyprian Aarons, AI Consultant at Topiax.
