How to Integrate OpenAI for Wealth Management with Pinecone for RAG

By Cyprian Aarons · Updated 2026-04-21

Tags: openai-for-wealth-management, pinecone, rag

OpenAI for wealth management gives you the reasoning layer for client-facing advice, portfolio summaries, and advisor copilots. Pinecone gives you the retrieval layer so those answers stay grounded in approved research, policy docs, market commentary, and client records.

Put them together and you get a RAG system that can answer wealth-management questions with context pulled from your own knowledge base instead of guessing.

Prerequisites

  • Python 3.10+
  • An OpenAI API key
  • A Pinecone API key
  • A Pinecone index created ahead of time
  • A document corpus for wealth management:
    • investment policy statements
    • product sheets
    • market commentary
    • compliance-approved FAQs
  • Installed packages:
    • openai
    • pinecone
    • python-dotenv
pip install openai pinecone python-dotenv
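The prerequisites assume your Pinecone index already exists. If it does not, a one-time setup script like the following can create it. This is a sketch assuming a recent Pinecone Python SDK with serverless indexes; the index name, cloud, and region are illustrative, and the dimension must match `text-embedding-3-small` (1536).

```python
import os

EMBEDDING_DIM = 1536   # output size of text-embedding-3-small
INDEX_NAME = "wealth-rag"

api_key = os.getenv("PINECONE_API_KEY")
if api_key:
    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key=api_key)

    # Create the index only if it doesn't already exist.
    if INDEX_NAME not in pc.list_indexes().names():
        pc.create_index(
            name=INDEX_NAME,
            dimension=EMBEDDING_DIM,  # must match the embedding model
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),
        )
```

Cosine similarity is the usual choice for OpenAI embeddings; if you pick a different embedding model later, you must recreate the index with the matching dimension.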

Integration Steps

  1. Set up environment variables

Keep secrets out of code. Use a .env file so your agent can run locally and in CI with the same config.

from dotenv import load_dotenv
import os

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_INDEX_NAME = os.getenv("PINECONE_INDEX_NAME", "wealth-rag")
  2. Initialize OpenAI and Pinecone clients

For generation, use the OpenAI Responses API. For retrieval, use Pinecone’s index query interface.

from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=OPENAI_API_KEY)
pc = Pinecone(api_key=PINECONE_API_KEY)

index = pc.Index(PINECONE_INDEX_NAME)
  3. Embed your wealth-management documents and upsert them into Pinecone

Use the same embedding model for both documents and user queries. That keeps vector similarity consistent.

from typing import List

docs = [
    {
        "id": "ips_001",
        "text": "This portfolio targets moderate growth with a 60/40 equity-fixed income allocation.",
        "metadata": {"source": "ips", "topic": "asset_allocation"}
    },
    {
        "id": "faq_014",
        "text": "Clients should rebalance when allocation drifts by more than 5 percentage points.",
        "metadata": {"source": "faq", "topic": "rebalancing"}
    }
]

texts: List[str] = [d["text"] for d in docs]

embeddings = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

vectors = []
for doc, emb in zip(docs, embeddings.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {**doc["metadata"], "text": doc["text"]}
    })

index.upsert(vectors=vectors)
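The sample documents above are single sentences, but real policy statements and market commentary run to many pages. Before embedding, split long documents into overlapping chunks so each vector covers a focused passage. A minimal word-window chunker (a hypothetical helper, not part of either SDK) might look like this:

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into overlapping windows of `chunk_size` words.

    Consecutive chunks share `overlap` words so a rule that straddles
    a chunk boundary still appears intact in at least one chunk.
    """
    words = text.split()
    if len(words) <= chunk_size:
        return [text]
    step = chunk_size - overlap
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), step)
    ]


# Each chunk keeps its parent document id so answers can still cite the source.
long_doc = "word " * 500
chunks = [
    {"id": f"ips_001#{n}", "text": chunk}
    for n, chunk in enumerate(chunk_text(long_doc))
]
```

Chunk size and overlap are assumptions to tune against your corpus; token-based splitting with a tokenizer is more precise, but word windows are enough to illustrate the idea.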
  4. Query Pinecone, then pass retrieved context to OpenAI

This is the core RAG flow. Retrieve top matches from Pinecone, then instruct OpenAI to answer only from that context.

def retrieve_context(query: str, top_k: int = 3) -> str:
    q_emb = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )

    result = index.query(
        vector=q_emb.data[0].embedding,
        top_k=top_k,
        include_metadata=True
    )

    chunks = []
    for match in result.matches:
        md = match.metadata or {}
        chunks.append(f"[{md.get('source')}] {md.get('text', '')}")

    return "\n".join(chunks)

def answer_with_rag(query: str) -> str:
    context = retrieve_context(query)

    response = openai_client.responses.create(
        model="gpt-4o-mini",
        input=[
            {
                "role": "system",
                "content": (
                    "You are a wealth management assistant. "
                    "Answer using only the provided context. "
                    "If the context is insufficient, say so."
                )
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {query}"
            }
        ]
    )

    return response.output_text

print(answer_with_rag("When should a client rebalance their portfolio?"))
  5. Add basic guardrails for production use

Wealth workflows need predictable behavior. Keep answers grounded, log retrieval results, and reject low-confidence responses when context is weak.

def safe_answer(query: str) -> dict:
    context = retrieve_context(query)

    if not context.strip():
        return {
            "answer": None,
            "confidence": "low",
            "reason": "No supporting documents found in Pinecone."
        }

    response = openai_client.responses.create(
        model="gpt-4o-mini",
        input=f"Use only this context:\n{context}\n\nQuestion: {query}"
    )

    return {
        "answer": response.output_text,
        "confidence": "medium",
        "sources": context.split("\n")
    }

Testing the Integration

Run a simple query against your indexed content and check that the answer reflects retrieved documents.

test_query = "What is the rebalancing threshold?"
result = safe_answer(test_query)

print("ANSWER:", result["answer"])
print("CONFIDENCE:", result["confidence"])
print("SOURCES:", result["sources"])

Expected output:

ANSWER: Clients should rebalance when allocation drifts by more than 5 percentage points.
CONFIDENCE: medium
SOURCES: ['[faq] Clients should rebalance when allocation drifts by more than 5 percentage points.']

Real-World Use Cases

  • Advisor copilot

    • Let relationship managers ask questions like “What does this client’s IPS allow?” and get grounded answers from approved internal docs.
  • Client Q&A assistant

    • Build a secure chatbot that explains portfolio rules, risk bands, rebalancing triggers, and product constraints using your firm’s content.
  • Research summarization agent

    • Retrieve market notes from Pinecone and have OpenAI generate concise summaries for advisors before client calls.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

