How to Integrate OpenAI with Pinecone for RAG in Pension Funds

By Cyprian Aarons · Updated 2026-04-21

Why this integration matters

If you’re building an AI agent for a pension fund, the hard part is not generating text. It’s grounding answers in the right policy docs, scheme rules, member communications, and investment research without hallucinating. OpenAI gives you the reasoning and response layer; Pinecone gives you low-latency semantic retrieval over your indexed pension knowledge base.

That combination is what turns a generic chatbot into a usable RAG system for member servicing, internal ops, and compliance support.

Prerequisites

  • Python 3.10+
  • An OpenAI API key
  • A Pinecone API key and an existing Pinecone project
  • A Pinecone index created with the right dimension for your embedding model
  • Installed packages:
    • openai
    • pinecone
    • python-dotenv
  • A document set to index:
    • pension scheme rules
    • FAQ PDFs
    • contribution policy docs
    • retirement process guides

Install dependencies:

pip install openai pinecone python-dotenv

Set environment variables:

export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="pension-rag-index"

Integration Steps

1) Initialize OpenAI and Pinecone clients

Use the current Python SDKs directly. Keep configuration in env vars so your agent runtime can be deployed cleanly across dev, staging, and production.

import os
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)

If you’re running this inside a service, load env vars from a .env file; call load_dotenv() before the client initialization above so the keys are present when they’re read:

from dotenv import load_dotenv

load_dotenv()

2) Create embeddings with OpenAI and upsert them into Pinecone

For RAG, you need to chunk your pension documents before embedding. Keep chunks small enough to retrieve precisely, usually 200–500 tokens.
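The right chunking strategy depends on your documents. As a starting point, here is a minimal word-based splitter; the 400-word window and 50-word overlap are illustrative defaults, not tuned values, and word count is only a rough proxy for tokens:

def chunk_text(text: str, max_words: int = 400, overlap: int = 50) -> list[str]:
    # Naive whitespace split; swap in a tokenizer for exact token budgets.
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + max_words
        chunks.append(" ".join(words[start:end]))
        if end >= len(words):
            break
        start = end - overlap  # overlap so a scheme rule isn't cut mid-clause
    return chunks

Each chunk then becomes one embedded vector in the step below.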

docs = [
    {
        "id": "scheme-rule-001",
        "text": "Members may retire early from age 55 subject to actuarial reduction and trustee approval.",
        "metadata": {"source": "scheme_rules.pdf", "section": "early_retirement"}
    },
    {
        "id": "faq-014",
        "text": "Contribution changes must be submitted by the 10th of each month to take effect in the next payroll cycle.",
        "metadata": {"source": "member_faq.pdf", "section": "contributions"}
    }
]

texts = [d["text"] for d in docs]

embeddings_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

vectors = []
for doc, emb in zip(docs, embeddings_response.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {
            **doc["metadata"],
            "text": doc["text"]
        }
    })

index.upsert(vectors=vectors)

Make sure your Pinecone index dimension matches the embedding model output. text-embedding-3-small returns 1536-dimensional vectors by default, so the index must be created with dimension 1536 (text-embedding-3-large would need 3072).
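If the index doesn’t exist yet, a one-time setup sketch might look like the following. The serverless cloud and region values are assumptions for illustration; recent versions of the pinecone package expose has_index, while older SDKs use pc.list_indexes().names() instead.

from pinecone import ServerlessSpec

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        dimension=1536,  # matches text-embedding-3-small output
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # assumed cloud/region
    )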

3) Retrieve relevant context from Pinecone for a user query

At query time, embed the question using the same OpenAI embedding model, then search Pinecone for top matches.

query = "Can a member retire at 54 under the pension scheme?"

query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=query
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

matches = results["matches"]
for match in matches:
    print(match["score"], match["metadata"]["text"])

This is where RAG starts paying off. You are no longer asking the model to answer from memory; you are feeding it evidence from your own pension corpus.
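Because each vector carries metadata, you can also scope retrieval with a Pinecone metadata filter, for example restricting a contributions question to that section of the corpus (the filter key matches the "section" field used in step 2):

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    filter={"section": {"$eq": "contributions"}},  # only contribution-related chunks
)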

4) Generate an answer with OpenAI using retrieved context

Now pass the retrieved chunks into a chat completion request. Keep the prompt tight and instruct the model not to invent policy details.

context_blocks = []
for m in matches:
    context_blocks.append(f"- {m['metadata']['text']}")

context = "\n".join(context_blocks)

messages = [
    {
        "role": "system",
        "content": (
            "You are a pension operations assistant. "
            "Answer only using the provided context. "
            "If the context is insufficient, say so clearly."
        )
    },
    {
        "role": "user",
        "content": f"""
Question: {query}

Context:
{context}

Answer in plain English and mention any uncertainty.
"""
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2
)

print(response.choices[0].message.content)

For regulated workflows, keep temperature low and require citations back to retrieved snippets or document IDs.
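One sketch of that pattern: prefix each context block with its Pinecone record ID and instruct the model to cite those IDs. The bracketed-ID convention here is an illustrative choice, not a required format.

# Prefix each retrieved chunk with its record ID so answers can cite sources.
cited_blocks = [f"[{m['id']}] {m['metadata']['text']}" for m in matches]
cited_context = "\n".join(cited_blocks)

citing_system_prompt = (
    "You are a pension operations assistant. "
    "Answer only from the provided context and cite the [id] of every "
    "snippet you rely on. If the context is insufficient, say so clearly."
)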

5) Wrap retrieval + generation into one reusable function

In production, this should be a single service method your agent calls whenever it needs grounded answers.

def rag_answer(question: str) -> str:
    q_emb = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=question
    ).data[0].embedding

    res = index.query(vector=q_emb, top_k=3, include_metadata=True)
    context = "\n".join([f"- {m['metadata']['text']}" for m in res["matches"]])

    completion = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a pension fund assistant. "
                    "Use only retrieved context. "
                    "Do not guess."
                )
            },
            {
                "role": "user",
                "content": f"Question: {question}\n\nContext:\n{context}"
            }
        ],
        temperature=0.1,
    )

    return completion.choices[0].message.content


print(rag_answer("What is the deadline for contribution changes?"))

Testing the Integration

Run a simple end-to-end check against one known policy question.

test_question = "What is the deadline for contribution changes?"

answer = rag_answer(test_question)
print("ANSWER:", answer)

Expected output (exact wording will vary):

ANSWER: Contribution changes must be submitted by the 10th of each month to take effect in the next payroll cycle.

If your output mentions an unrelated date, or the model says it cannot find enough context when you know you indexed the FAQ, check these first (a quick verification sketch follows the list):

  • The document chunk was actually upserted into Pinecone
  • The query embedding model matches the indexed vectors
  • Your prompt is not allowing free-form guessing
  • The metadata text field contains usable content
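For the first two items, you can interrogate the index directly. A minimal sanity check, assuming the sample docs from step 2 were upserted (so the ID faq-014 exists):

# Confirm vectors were upserted and the dimension is what you expect.
stats = index.describe_index_stats()
print("vectors:", stats.total_vector_count, "dimension:", stats.dimension)

# Fetch a known record and verify its metadata text survived the upsert.
fetched = index.fetch(ids=["faq-014"])
print(fetched.vectors["faq-014"].metadata["text"])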

Real-World Use Cases

  • Member servicing assistant: answer questions about retirement age, contribution deadlines, transfer rules, and benefit access using approved scheme documents.
  • Internal ops copilot: help support teams find policy clauses fast when handling escalations or exceptions.
  • Compliance-aware knowledge search: ground responses in source documents so reviewers can trace where an answer came from before it reaches members or advisors.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
