How to Integrate AutoGen for insurance with Docker for RAG
AutoGen for insurance gives you the agent orchestration layer for policy, claims, and underwriting workflows. Docker gives you a repeatable runtime for retrieval pipelines, vector stores, and document processors. Put them together and you get a RAG system that can answer insurance questions from regulated documents without depending on a fragile local setup.
Prerequisites
- Python 3.10+
- Docker Desktop or Docker Engine running
- An AutoGen for insurance project with API access configured
- Access to your LLM provider keys
- A vector store endpoint or local containerized store
- Basic familiarity with:
  - autogen_agentchat
  - the docker Python SDK
  - RAG concepts: chunking, embedding, retrieval, generation
Integration Steps
Step 1: Start the Docker-backed retrieval services

Run your document processor and vector store in containers. For a production setup, keep ingestion and query services separate so you can scale them independently.

```python
import docker

client = docker.from_env()

# Example: start a Postgres container for metadata
postgres = client.containers.run(
    "postgres:16",
    name="rag-postgres",
    detach=True,
    environment={
        "POSTGRES_USER": "rag",
        "POSTGRES_PASSWORD": "ragpass",
        "POSTGRES_DB": "insurance_rag",
    },
    ports={"5432/tcp": 5432},
    auto_remove=True,  # remove the container automatically when it stops
)
print(postgres.id)
```
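A container reports as started before the service inside it is ready to accept connections, so a short readiness poll avoids racing the database. This is a generic sketch; `wait_for_port` is a helper defined here, not part of the docker SDK.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# After starting rag-postgres, block briefly until it accepts connections.
# In practice, allow ~30s to cover a first-time image pull.
ready = wait_for_port("localhost", 5432, timeout=3.0)
print("postgres ready:", ready)
```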
Step 2: Build a Dockerized ingestion service for policy documents

Your ingestion container should parse PDFs, split text into chunks, and write embeddings to your vector store. Keep the logic in a dedicated service so AutoGen only calls clean APIs.

```python
import docker

client = docker.from_env()

# In production, build a custom image with your PDF parsing and
# embedding dependencies preinstalled instead of a bare python:3.11-slim.
container = client.containers.run(
    "python:3.11-slim",
    name="insurance-ingestor",
    command="python /app/ingest.py",
    volumes={
        "/home/dev/insurance-rag/app": {"bind": "/app", "mode": "ro"}
    },
    working_dir="/app",
    detach=True,
    environment={
        "VECTOR_DB_URL": "http://host.docker.internal:8000",
        "DOCS_PATH": "/app/docs",
    },
    auto_remove=True,  # remove the container automatically when ingestion finishes
)
print(container.status)  # snapshot at creation time; use container.logs() to follow progress
```
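The contents of ingest.py are not shown above; a minimal version of its chunking step might look like the sketch below. The names are hypothetical, and real pipelines usually split on sentence or clause boundaries before embedding each chunk.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap, so a clause
    cut at a chunk boundary still appears whole in the neighboring chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Toy stand-in for parsed policy text.
policy_text = "Claims must be filed within 30 days of discovery. " * 20
chunks = chunk_text(policy_text, chunk_size=200, overlap=40)
print(len(chunks), "chunks")
```

Each chunk would then be embedded and written to the vector store keyed by document and position.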
Step 3: Create an AutoGen assistant that calls the retrieval service

In AutoGen for insurance, define an assistant agent that handles user questions and delegates retrieval to your Dockerized service through a tool function.

```python
import requests
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="YOUR_API_KEY",
)

def retrieve_policy_context(query: str) -> str:
    """Fetch relevant policy context for a query from the retrieval API."""
    resp = requests.post(
        "http://localhost:8080/retrieve",
        json={"query": query, "top_k": 5},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["context"]

assistant = AssistantAgent(
    name="insurance_rag_assistant",
    model_client=model_client,
    tools=[retrieve_policy_context],
    system_message=(
        "You answer insurance questions using retrieved policy context only. "
        "If context is missing, say what is missing."
    ),
)
```
Step 4: Wire the agent to the Docker-hosted RAG API

Expose your retriever as an HTTP service inside Docker, then let AutoGen call it during agent runs. This keeps the agent stateless and makes the retrieval layer easy to swap out.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    query: str
    top_k: int = 5

@app.post("/retrieve")
def retrieve(req: QueryRequest):
    # Replace with actual vector DB search + reranking
    chunks = [
        {"text": "Coverage includes accidental damage subject to exclusions."},
        {"text": "Claims must be filed within 30 days of discovery."},
    ]
    context = "\n\n".join(chunk["text"] for chunk in chunks[: req.top_k])
    return {"context": context}
```
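The stub above returns fixed chunks; what it stands in for is a nearest-neighbor lookup over embeddings. A minimal in-memory version is sketched below for illustration only: `cosine` and `top_k_chunks` are hypothetical helpers, and a real vector DB performs this search server-side with approximate-nearest-neighbor indexes.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_vec, index, k=5):
    """index: list of (embedding, text) pairs; return the k texts most similar to the query."""
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Toy two-dimensional "embeddings" standing in for real model output.
index = [
    ([1.0, 0.0], "Claims must be filed within 30 days of discovery."),
    ([0.0, 1.0], "Coverage includes accidental damage subject to exclusions."),
]
print(top_k_chunks([0.9, 0.1], index, k=1))
```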
Step 5: Run a full agent interaction

Use the agent to ask a question that requires policy grounding. The model should retrieve context from the Dockerized service before answering.

```python
import asyncio

async def main():
    result = await assistant.run(
        task="Does this policy cover accidental damage on mobile devices?"
    )
    print(result.messages[-1].content)

asyncio.run(main())
```
Testing the Integration
Use one direct retrieval call and one agent run. That verifies both sides of the integration: Docker is serving RAG data, and AutoGen is consuming it.
```python
import requests
import asyncio

# Test retrieval API directly
resp = requests.post(
    "http://localhost:8080/retrieve",
    json={"query": "accidental damage mobile devices", "top_k": 2},
    timeout=30,
)
print(resp.json()["context"])

# Test agent path
async def test_agent():
    result = await assistant.run(
        task="What is the claim filing deadline?"
    )
    print(result.messages[-1].content)

asyncio.run(test_agent())
```
Expected output from the direct retrieval call:

```
Coverage includes accidental damage subject to exclusions.

Claims must be filed within 30 days of discovery.
```
The exact wording will vary based on your indexed documents, but you should see grounded answers instead of generic model output.
Real-World Use Cases
- Policy Q&A assistant: let brokers or internal ops teams ask questions about coverage limits, exclusions, and endorsements.
- Claims triage agent: retrieve relevant policy clauses and claim history before routing or approving a claim.
- Underwriting copilot: pull risk guidelines, appetite rules, and prior submissions into one agent workflow for faster review.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit