How to Fix 'authentication failed in production' in AutoGen (Python)
If you’re seeing authentication failed in production while using AutoGen with Python, the issue is usually not AutoGen itself. It’s almost always a bad credential path: wrong API key, missing environment variable, or a model client configured for the wrong provider.
This tends to show up when your local .env works, but the deployed app fails in Docker, on Azure, in CI, or behind a secrets manager. The stack trace often ends with something like openai.AuthenticationError: Error code: 401 or autogen_core._exceptions.AuthenticationError.
The Most Common Cause
The #1 cause is simple: your code reads the key from one place locally, but production does not have that same environment variable set.
In AutoGen, this usually happens when you create an OpenAIChatCompletionClient or similar client with api_key=os.getenv(...), then deploy without wiring that env var into the runtime.
Broken vs fixed
| Broken pattern | Fixed pattern |
|---|---|
| Reads a key that exists only on your laptop | Loads the key from the production secret store / env |
| Silently passes None into the client | Fails fast if the key is missing |
| Works in local .env, fails in container/server | Works consistently across environments |
# BROKEN
import os
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key=os.getenv("OPENAI_API_KEY"),  # None in prod if env isn't wired
)
agent = AssistantAgent(
    name="support_agent",
    model_client=model_client,
)
# FIXED
import os
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
api_key = os.environ["OPENAI_API_KEY"] # fail fast if missing
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key=api_key,
)
agent = AssistantAgent(
    name="support_agent",
    model_client=model_client,
)
If you want this to be production-safe, validate startup before the agent runs:
import os

# Collect every missing variable so one startup error lists them all
required = ["OPENAI_API_KEY"]
missing = [k for k in required if not os.getenv(k)]
if missing:
    raise RuntimeError(f"Missing required env vars: {missing}")
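The check above catches variables that are unset or empty. A value with stray whitespace, such as a trailing newline picked up from a mounted secret file, still passes and can then be rejected by the provider. A minimal sketch of a stricter loader follows. `load_secret` is a hypothetical helper name, not part of AutoGen, and the optional `env` argument exists only so the function can be exercised without touching the real environment.

```python
import os


def load_secret(name, env=None):
    """Return a required secret with surrounding whitespace removed.

    Illustrative helper (not an AutoGen API). Raises RuntimeError when the
    variable is unset, empty, or contains only whitespace.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(f"Missing required env var: {name}")
    return value


# Demo with an explicit mapping instead of the real environment:
# the trailing newline is stripped before the key reaches any client.
demo_env = {"OPENAI_API_KEY": "sk-example\n"}
print(repr(load_secret("OPENAI_API_KEY", env=demo_env)))  # prints 'sk-example'
```

In a real service you would call `load_secret("OPENAI_API_KEY")` with no `env` argument, before constructing any model client.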
Other Possible Causes
1) Wrong provider class for the model endpoint
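A common mistake is using the OpenAI client class against an Azure OpenAI endpoint, or vice versa. Authentication can fail even though the key is valid, because Azure OpenAI expects the key in an api-key header and routes requests by deployment name and API version, while the OpenAI API expects an Authorization: Bearer header.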
# WRONG: OpenAI client pointed at Azure endpoint
OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    base_url="https://my-resource.openai.azure.com/",
)
Use the Azure-specific client/config instead:
# RIGHT: Azure client with Azure settings
import os
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

client = AzureOpenAIChatCompletionClient(
    azure_deployment="gpt-4o-mini",  # your Azure deployment name
    model="gpt-4o-mini",  # underlying base model
    api_version="2024-06-01",
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
2) Environment variable exists locally but not in production
This shows up a lot in Docker and Kubernetes. Your .env file is not automatically available inside the container unless you pass it through.
# docker-compose.yml snippet
services:
  app:
    image: my-agent-app
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY}
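If you forget that mapping, the variable is missing or empty inside the container. Depending on the SDK version and on whether the value is absent or just empty, the client then fails either at construction or on the first request, with something like:
- openai.AuthenticationError: Error code: 401
- Authentication failed
- No API key provided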
3) Expired or rotated secret
If your org rotates secrets, your deployed service may still be holding an old key. This is common when keys are injected at build time instead of runtime.
# bad practice: baking secrets into image build args
docker build --build-arg OPENAI_API_KEY=...
Use runtime injection instead:
docker run -e OPENAI_API_KEY="$OPENAI_API_KEY" my-agent-app
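To confirm whether a running service still holds a pre-rotation key, it can help to log a short fingerprint of the key at startup and compare it with the fingerprint of the current secret. The sketch below is one way to do that without printing the secret itself. `key_fingerprint` is a hypothetical helper name, and the 8-character length is an arbitrary choice.

```python
import hashlib
import os


def key_fingerprint(secret, length=8):
    """Return a short SHA-256 hex prefix that identifies a secret without revealing it."""
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    return digest[:length]


# Log at startup, then compare with the fingerprint of the freshly rotated key.
current = os.getenv("OPENAI_API_KEY", "")
if current:
    print("OPENAI_API_KEY fingerprint:", key_fingerprint(current))
else:
    print("OPENAI_API_KEY is not set")
```

If the fingerprint logged by the deployed service differs from the one computed for the current secret, the service is still running with an old key and needs a restart or redeploy with runtime injection.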
4) Wrong model name or deployment name
A bad deployment name can surface as an auth-like failure, depending on the provider and SDK version. This happens often with Azure because people confuse the deployment name with the base model name.
# WRONG if "gpt-4o" is not your Azure deployment name
AzureOpenAIChatCompletionClient(
    azure_deployment="gpt-4o",
    api_version="2024-06-01",
)
Make sure azure_deployment matches the exact deployment configured in Azure OpenAI Studio.
How to Debug It
- Print what AutoGen is actually using
  - Check whether your process sees the expected env vars.
  - Don’t print full secrets; just confirm presence.

import os
print("OPENAI_API_KEY set:", bool(os.getenv("OPENAI_API_KEY")))
print("AZURE_OPENAI_API_KEY set:", bool(os.getenv("AZURE_OPENAI_API_KEY")))

- Identify which client class you instantiated
  - If you’re calling OpenAIChatCompletionClient, use OpenAI credentials and endpoint style.
  - If you’re calling AzureOpenAIChatCompletionClient, use Azure endpoint + deployment.
- Reproduce outside AutoGen
  - Make one raw SDK call with the same env vars.
  - If the raw OpenAI/Azure call fails with 401, it’s not an AutoGen bug.
- Inspect the exact exception text
  - openai.AuthenticationError usually means invalid/missing token.
  - autogen_core._exceptions.AuthenticationError often wraps provider-level auth failures.
  - Endpoint mismatch often looks like auth failure but is really config drift.
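The "reproduce outside AutoGen" step can be sketched with only the standard library. This example uses a plain HTTP request in place of an SDK call: it sends a GET to the OpenAI `/v1/models` endpoint with the same key the agent would use and reports the HTTP status. The helper names `build_models_request` and `check_openai_key` are hypothetical, and the sketch assumes the public OpenAI API. An Azure OpenAI resource uses a different URL shape and an `api-key` header, so it would need its own variant.

```python
import urllib.error
import urllib.request

OPENAI_MODELS_URL = "https://api.openai.com/v1/models"


def build_models_request(api_key):
    """Build an authenticated GET request for the OpenAI model list."""
    return urllib.request.Request(
        OPENAI_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def check_openai_key(api_key, timeout=10.0):
    """Return the HTTP status of a model-list call made with this key.

    200 means the key was accepted; 401 means the provider rejected it.
    Network-level failures (DNS, TLS, timeouts) are left to propagate.
    """
    request = build_models_request(api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Error responses such as 401 arrive as HTTPError; report the code.
        return exc.code
```

Running `check_openai_key(os.environ["OPENAI_API_KEY"])` inside the production container, not on your laptop, is what makes the comparison meaningful. A 401 there with a 200 locally points at the deployed credential rather than at AutoGen.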
Prevention
- Validate credentials at process startup, not after agent creation.
- Keep provider-specific config separate:
  - OpenAI: API key + OpenAI client class
  - Azure OpenAI: endpoint + deployment + Azure client class
- Treat secrets as runtime configuration only:
  - Kubernetes Secret, AWS Secrets Manager, Azure Key Vault, CI variables, not hardcoded values or build args
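One way to keep provider-specific configuration separate is to load each provider's settings into its own typed object at startup and fail if any piece is missing. The sketch below is an illustration under assumptions: the environment variable names `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_DEPLOYMENT` are hypothetical (only `OPENAI_API_KEY` and `AZURE_OPENAI_API_KEY` appear in this article), and the class and helper names are made up for the example.

```python
import os
from dataclasses import dataclass


def _require(env, names):
    """Return stripped values for all names, or raise listing every missing one."""
    values = {name: (env.get(name) or "").strip() for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError(f"Missing required env vars: {missing}")
    return values


@dataclass(frozen=True)
class OpenAISettings:
    api_key: str

    @classmethod
    def from_env(cls, env=None):
        values = _require(os.environ if env is None else env, ["OPENAI_API_KEY"])
        return cls(api_key=values["OPENAI_API_KEY"])


@dataclass(frozen=True)
class AzureOpenAISettings:
    api_key: str
    endpoint: str    # read from AZURE_OPENAI_ENDPOINT (assumed name)
    deployment: str  # read from AZURE_OPENAI_DEPLOYMENT (assumed name)

    @classmethod
    def from_env(cls, env=None):
        values = _require(
            os.environ if env is None else env,
            ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_DEPLOYMENT"],
        )
        return cls(
            api_key=values["AZURE_OPENAI_API_KEY"],
            endpoint=values["AZURE_OPENAI_ENDPOINT"],
            deployment=values["AZURE_OPENAI_DEPLOYMENT"],
        )


# Demo with an explicit mapping; a real service would call from_env() with no argument.
demo = AzureOpenAISettings.from_env({
    "AZURE_OPENAI_API_KEY": "example-key",
    "AZURE_OPENAI_ENDPOINT": "https://my-resource.openai.azure.com/",
    "AZURE_OPENAI_DEPLOYMENT": "gpt-4o-mini",
})
print(demo.deployment)  # prints gpt-4o-mini
```

Because each settings class maps to exactly one client class, passing Azure values to the OpenAI client (or the reverse) becomes a visible mismatch in code review rather than a 401 in production.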
If you’re still stuck, compare local vs production config line by line. In AutoGen projects, “authentication failed” usually means “the runtime did not get the credential shape this client expects.”
By Cyprian Aarons, AI Consultant at Topiax.