How to Integrate OpenAI with Pinecone for Wealth Management Startups
Combining OpenAI with Pinecone gives you a practical pattern for building wealth management copilots that can answer client questions from your own indexed knowledge, not just model memory. For startups, that means you can ship portfolio Q&A, policy lookup, investment note retrieval, and advisor-assist workflows without stuffing everything into prompts.
The useful part is this: OpenAI handles reasoning and response generation, while Pinecone stores the firm’s documents, research notes, and client-specific context as embeddings. That gives you a retrieval-augmented system that stays grounded in your data.
Prerequisites
- Python 3.10+
- An OpenAI API key
- A Pinecone API key
- A Pinecone index created in advance
- pip installed
- Basic familiarity with embeddings and vector search
- Access to the documents you want to index:
  - investment policy statements
  - product FAQs
  - market commentary
  - client onboarding notes
Install the SDKs:
```bash
pip install openai pinecone
```
Set environment variables:
```bash
export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
```
Integration Steps
1) Initialize OpenAI and Pinecone clients
Start by creating both clients in the same service layer. Keep them separate; OpenAI generates embeddings and answers, Pinecone handles retrieval.
```python
import os
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pinecone_client = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = "wealth-management-kb"
index = pinecone_client.Index(index_name)
```
If you are using Pinecone Serverless, create the index once from a setup script or dashboard. In production, do not create indexes on app startup.
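If it helps, here is a minimal one-time setup sketch for a serverless index. The cloud and region below are assumptions to adjust for your account, and the dimension must match the embedding model: text-embedding-3-small produces 1536-dimensional vectors.

```python
# One-time setup script for a serverless index (run once, not on app startup).
# Cloud and region are assumptions; adjust for your account.
import os
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

if "wealth-management-kb" not in pc.list_indexes().names():
    pc.create_index(
        name="wealth-management-kb",
        dimension=1536,  # must match text-embedding-3-small's output size
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
```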
2) Convert wealth management content into embeddings
Use OpenAI’s embeddings endpoint to turn documents into vectors. For wealth management, chunk by topic: allocation strategy, fees, risk disclosures, tax notes, and account rules.
```python
documents = [
    {
        "id": "doc-001",
        "text": "Our balanced portfolio targets 60/40 allocation with quarterly rebalancing.",
        "metadata": {"source": "IPS", "topic": "allocation"}
    },
    {
        "id": "doc-002",
        "text": "Advisory fees are charged at 0.75% annually on assets under management.",
        "metadata": {"source": "fee_schedule", "topic": "fees"}
    }
]

texts = [d["text"] for d in documents]

embeddings_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

vectors = []
for doc, emb in zip(documents, embeddings_response.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {
            **doc["metadata"],
            "text": doc["text"]
        }
    })
```
For startup systems, keep metadata rich. You will want filters like topic=fees or source=IPS later when you retrieve context.
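As a sketch of where that pays off, a retrieval call can scope the search with Pinecone's metadata filter syntax. This reuses the query_embedding built in step 4 below:

```python
# Restrict retrieval to fee-related content using a metadata filter.
# Assumes query_embedding from step 4; topic values match the metadata above.
fee_results = index.query(
    vector=query_embedding,
    top_k=3,
    filter={"topic": {"$eq": "fees"}},
    include_metadata=True,
)
```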
3) Upsert vectors into Pinecone
Push the embeddings into your index so they can be searched by semantic similarity.
```python
upsert_result = index.upsert(vectors=vectors)
print(upsert_result)
```
A clean upsert flow should be idempotent. If a document changes, overwrite the same vector ID rather than creating duplicates.
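As a minimal sketch of that idea: re-embed the updated document and upsert it under the same ID, which replaces the old vector instead of adding a duplicate. The revised fee wording here is purely illustrative.

```python
# Updating doc-002 in place: reusing the ID overwrites the existing vector.
# The revised fee text below is an illustrative example, not real data.
updated_text = "Advisory fees are charged at 0.70% annually on assets under management."
updated_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[updated_text]
).data[0].embedding

index.upsert(vectors=[{
    "id": "doc-002",
    "values": updated_embedding,
    "metadata": {"source": "fee_schedule", "topic": "fees", "text": updated_text}
}])
```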
4) Query Pinecone with a user question
When a user asks something like “What is the fee on advisory accounts?” embed the question using the same OpenAI embedding model, then query Pinecone for matching context.
```python
query_text = "What advisory fee do we charge?"

query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[query_text]
).data[0].embedding

search_results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

matches = search_results["matches"]
context_chunks = [m["metadata"]["text"] for m in matches]
context = "\n\n".join(context_chunks)
print(context)
```
This is where Pinecone earns its keep. You are not asking the model to guess; you are retrieving the most relevant internal content first.
5) Generate the final answer with OpenAI
Now feed the retrieved context into OpenAI’s chat completions API and ask it to answer only from that material. This is the core RAG pattern for wealth management assistants.
```python
messages = [
    {
        "role": "system",
        "content": (
            "You are a wealth management assistant. "
            "Answer only using the provided context. "
            "If the context does not contain the answer, say you do not have enough information."
        )
    },
    {
        "role": "user",
        "content": f"Context:\n{context}\n\nQuestion: {query_text}"
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2
)

answer = response.choices[0].message.content
print(answer)
```
For regulated environments, keep temperature low and enforce grounded responses. If needed, add citations by returning document IDs from Pinecone alongside the answer.
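A minimal sketch of the citation idea, reusing the matches from step 4: carry each match's ID and source through to the response so the answer can be attributed.

```python
# Build a context string that tags each chunk with its document ID and source,
# and collect the IDs to return alongside the generated answer.
cited_chunks = [
    f"[{m['id']} | {m['metadata'].get('source', 'unknown')}] {m['metadata']['text']}"
    for m in matches
]
context_with_citations = "\n\n".join(cited_chunks)
cited_ids = [m["id"] for m in matches]
```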
Testing the Integration
Run an end-to-end test with a known question and verify that retrieved context influences the answer.
```python
test_question = "How much do we charge in advisory fees?"

test_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[test_question]
).data[0].embedding

test_matches = index.query(
    vector=test_embedding,
    top_k=1,
    include_metadata=True
)["matches"]

test_context = test_matches[0]["metadata"]["text"]

test_response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer only from context."},
        {"role": "user", "content": f"Context: {test_context}\nQuestion: {test_question}"}
    ],
)

print("Retrieved:", test_context)
print("Answer:", test_response.choices[0].message.content)
```
Expected output (the exact wording of the answer may vary):

```text
Retrieved: Advisory fees are charged at 0.75% annually on assets under management.
Answer: Advisory fees are charged at 0.75% annually on assets under management.
```
If your answer drifts beyond the retrieved text, tighten your system prompt or reduce temperature further.
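One way to tighten it, sketched here with temperature 0 and an explicit refusal instruction; the exact prompt wording is just a starting point.

```python
# Stricter grounding: zero temperature plus a hard refusal default.
strict_response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0,
    messages=[
        {
            "role": "system",
            "content": (
                "Answer strictly from the provided context. "
                "If the context does not contain the answer, reply exactly: "
                "'I do not have enough information to answer that.'"
            )
        },
        {
            "role": "user",
            "content": f"Context:\n{test_context}\n\nQuestion: {test_question}"
        }
    ],
)
print(strict_response.choices[0].message.content)
```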
Real-World Use Cases
- Advisor assistant: answer client questions about portfolios, fees, rebalancing rules, and account policies using internal documents.
- Research copilot: index market commentary and house views so advisors can query prior research before sending client updates.
- Client onboarding bot: retrieve KYC requirements, suitability rules, and product constraints during onboarding flows.
If you are building this for a startup, keep one rule: OpenAI reasons over retrieved context; Pinecone decides what context gets seen. That separation keeps your agent more reliable and easier to audit.
Keep learning
- The complete AI Agents Roadmap (my full 8-step breakdown)
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me (I build AI for banks and insurance companies)
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit