How to Integrate AutoGen for insurance with Docker for RAG
AutoGen for insurance gives you the agent orchestration layer for policy, claims, and underwriting workflows. Docker gives you a repeatable runtime for retrieval pipelines, vector stores, and document processors. Put them together and you get a RAG system that can answer insurance questions from regulated documents without depending on a fragile local setup.
Prerequisites
- Python 3.10+
- Docker Desktop or Docker Engine running
- An AutoGen for insurance project with API access configured
- Access to your LLM provider keys
- A vector store endpoint or local containerized store
- Basic familiarity with:
  - autogen_agentchat
  - the docker Python SDK
  - RAG concepts: chunking, embedding, retrieval, generation
Integration Steps
Step 1: Start the Docker-backed retrieval services

Run your document processor and vector store in containers. For a production setup, keep ingestion and query services separate so you can scale them independently.

```python
import docker

client = docker.from_env()

# Example: start a Postgres container for metadata
postgres = client.containers.run(
    "postgres:16",
    name="rag-postgres",
    detach=True,
    environment={
        "POSTGRES_USER": "rag",
        "POSTGRES_PASSWORD": "ragpass",
        "POSTGRES_DB": "insurance_rag",
    },
    ports={"5432/tcp": 5432},
    auto_remove=True,  # remove the container automatically when it stops
)
print(postgres.id)
```
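A container reports as started before the service inside it is ready to accept connections, so a short readiness poll avoids racing the database. This is a generic sketch; `wait_for_port` is a helper defined here, not part of the docker SDK.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# After starting rag-postgres, block briefly until it accepts connections.
# In practice, allow ~30s to cover a first-time image pull.
ready = wait_for_port("localhost", 5432, timeout=3.0)
print("postgres ready:", ready)
```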
Step 2: Build a Dockerized ingestion service for policy documents

Your ingestion container should parse PDFs, split text into chunks, and write embeddings to your vector store. Keep the logic in a dedicated service so AutoGen only calls clean APIs.

```python
import docker

client = docker.from_env()

# In production, build a custom image with your PDF parsing and
# embedding dependencies preinstalled instead of a bare python:3.11-slim.
container = client.containers.run(
    "python:3.11-slim",
    name="insurance-ingestor",
    command="python /app/ingest.py",
    volumes={
        "/home/dev/insurance-rag/app": {"bind": "/app", "mode": "ro"}
    },
    working_dir="/app",
    detach=True,
    environment={
        "VECTOR_DB_URL": "http://host.docker.internal:8000",
        "DOCS_PATH": "/app/docs",
    },
    auto_remove=True,  # remove the container automatically when ingestion finishes
)
print(container.status)  # snapshot at creation time; use container.logs() to follow progress
```
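The contents of ingest.py are not shown above; a minimal version of its chunking step might look like the sketch below. The names are hypothetical, and real pipelines usually split on sentence or clause boundaries before embedding each chunk.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap, so a clause
    cut at a chunk boundary still appears whole in the neighboring chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Toy stand-in for parsed policy text.
policy_text = "Claims must be filed within 30 days of discovery. " * 20
chunks = chunk_text(policy_text, chunk_size=200, overlap=40)
print(len(chunks), "chunks")
```

Each chunk would then be embedded and written to the vector store keyed by document and position.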
Step 3: Create an AutoGen assistant that calls the retrieval service

In AutoGen for insurance, define an assistant agent that handles user questions and delegates retrieval to your Dockerized service through a tool function.

```python
import requests
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    api_key="YOUR_API_KEY",
)

def retrieve_policy_context(query: str) -> str:
    """Fetch relevant policy context for a query from the retrieval API."""
    resp = requests.post(
        "http://localhost:8080/retrieve",
        json={"query": query, "top_k": 5},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["context"]

assistant = AssistantAgent(
    name="insurance_rag_assistant",
    model_client=model_client,
    tools=[retrieve_policy_context],
    system_message=(
        "You answer insurance questions using retrieved policy context only. "
        "If context is missing, say what is missing."
    ),
)
```
Step 4: Wire the agent to the Docker-hosted RAG API

Expose your retriever as an HTTP service inside Docker, then let AutoGen call it during agent runs. This keeps the agent stateless and makes the retrieval layer easy to swap out.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    query: str
    top_k: int = 5

@app.post("/retrieve")
def retrieve(req: QueryRequest):
    # Replace with actual vector DB search + reranking
    chunks = [
        {"text": "Coverage includes accidental damage subject to exclusions."},
        {"text": "Claims must be filed within 30 days of discovery."},
    ]
    context = "\n\n".join(chunk["text"] for chunk in chunks[: req.top_k])
    return {"context": context}
```
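The stub above returns fixed chunks; what it stands in for is a nearest-neighbor lookup over embeddings. A minimal in-memory version is sketched below for illustration only: `cosine` and `top_k_chunks` are hypothetical helpers, and a real vector DB performs this search server-side with approximate-nearest-neighbor indexes.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_vec, index, k=5):
    """index: list of (embedding, text) pairs; return the k texts most similar to the query."""
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Toy two-dimensional "embeddings" standing in for real model output.
index = [
    ([1.0, 0.0], "Claims must be filed within 30 days of discovery."),
    ([0.0, 1.0], "Coverage includes accidental damage subject to exclusions."),
]
print(top_k_chunks([0.9, 0.1], index, k=1))
```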
Step 5: Run a full agent interaction

Use the agent to ask a question that requires policy grounding. The model should retrieve context from the Dockerized service before answering.

```python
import asyncio

async def main():
    result = await assistant.run(
        task="Does this policy cover accidental damage on mobile devices?"
    )
    print(result.messages[-1].content)

asyncio.run(main())
```
Testing the Integration
Use one direct retrieval call and one agent run. That verifies both sides of the integration: Docker is serving RAG data, and AutoGen is consuming it.
```python
import requests
import asyncio

# Test retrieval API directly
resp = requests.post(
    "http://localhost:8080/retrieve",
    json={"query": "accidental damage mobile devices", "top_k": 2},
    timeout=30,
)
print(resp.json()["context"])

# Test agent path
async def test_agent():
    result = await assistant.run(
        task="What is the claim filing deadline?"
    )
    print(result.messages[-1].content)

asyncio.run(test_agent())
```
Expected output from the direct retrieval call:

```
Coverage includes accidental damage subject to exclusions.

Claims must be filed within 30 days of discovery.
```
The exact wording will vary based on your indexed documents, but you should see grounded answers instead of generic model output.
Real-World Use Cases
- Policy Q&A assistant: let brokers or internal ops teams ask questions about coverage limits, exclusions, and endorsements.
- Claims triage agent: retrieve relevant policy clauses and claim history before routing or approving a claim.
- Underwriting copilot: pull risk guidelines, appetite rules, and prior submissions into one agent workflow for faster review.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit