# How to Integrate OpenAI with Pinecone for RAG in Investment Banking
Combining OpenAI with Pinecone gives you a clean RAG stack for investment banking workflows: retrieve the right deal docs, filings, or research notes from Pinecone, then have OpenAI turn that context into an answer, summary, or draft memo. The practical win is simple: bankers and analysts get faster answers grounded in internal data instead of model guesses.
## Prerequisites
- Python 3.10+
- An OpenAI API key
- A Pinecone API key and an existing Pinecone index
- A vector embedding model choice, such as `text-embedding-3-small`
- Installed packages: `openai`, `pinecone`, `python-dotenv`
- A document corpus to index: pitch decks, CIMs, earnings call transcripts, internal research notes
Install the SDKs:

```bash
pip install openai pinecone python-dotenv
```
Set environment variables:

```bash
export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="ib-rag"
```
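If you prefer a local `.env` file to shell exports, `python-dotenv` (already in the install list) can load the same variables at startup; a minimal sketch:

```python
# Optional: load the same three variables from a local .env file
# instead of exporting them in the shell.
from dotenv import load_dotenv

load_dotenv()  # reads OPENAI_API_KEY, PINECONE_API_KEY, PINECONE_INDEX_NAME
```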
## Integration Steps
### 1. Initialize both clients
Use the current OpenAI and Pinecone SDKs directly. Keep client setup in one module so your agent can reuse it across ingestion and query paths.
```python
import os

from openai import OpenAI
from pinecone import Pinecone

# One shared setup module: the ingestion and query paths reuse these clients.
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
```
### 2. Create embeddings for your banking documents
For RAG, you convert each chunk into a vector before storing it in Pinecone. In investment banking, chunk by section boundaries where possible: company overview, financials, comps, risks, and transaction rationale. A simple section-aware chunker is sketched after the embedding example below.
```python
def embed_text(text: str) -> list[float]:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding


sample_chunk = """
Company: Acme Corp
Section: Financial Highlights
Revenue grew 18% YoY to $420M.
EBITDA margin expanded to 24%.
"""

vector = embed_text(sample_chunk)
print(len(vector))  # 1536 for text-embedding-3-small
```
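The section-boundary chunking recommended above can start as simple heading-based splitting. A minimal sketch, assuming your documents carry labeled section headings; the heading names and the `chunk_by_section` helper are illustrative, not part of the original setup:

```python
# Illustrative section-aware chunker: split a document on known
# headings so each chunk stays within one topic.
SECTION_NAMES = {
    "Company Overview",
    "Financials",
    "Comps",
    "Risks",
    "Transaction Rationale",
}


def chunk_by_section(document: str) -> list[tuple[str, str]]:
    """Return (section_name, section_text) pairs for each known heading."""
    chunks: list[tuple[str, str]] = []
    current_name, current_lines = "Preamble", []
    for line in document.splitlines():
        if line.strip() in SECTION_NAMES:
            # Close out the previous section before starting the next one.
            if current_lines:
                chunks.append((current_name, "\n".join(current_lines).strip()))
            current_name, current_lines = line.strip(), []
        else:
            current_lines.append(line)
    if current_lines:
        chunks.append((current_name, "\n".join(current_lines).strip()))
    return chunks
```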
### 3. Upsert vectors into Pinecone with metadata
Store the raw text or a reference ID in metadata so you can reconstruct the prompt later. For banking use cases, metadata like ticker, sector, doc_type, date, and deal_id matters more than generic tags.
```python
doc_id = "acme_2024_q2_financial_highlights"
metadata = {
    "ticker": "ACME",
    "doc_type": "earnings_note",
    "date": "2024-08-01",
    "source": "internal_research",
    "text": sample_chunk,  # raw text stored so the prompt can be rebuilt at query time
}

index.upsert(
    vectors=[
        {
            "id": doc_id,
            "values": vector,
            "metadata": metadata,
        }
    ]
)
print("Upsert complete")
```
### 4. Query Pinecone with a user question
At runtime, embed the question, search Pinecone for the closest chunks, then pass those chunks into OpenAI as context. This is the core RAG loop.
```python
def retrieve_context(query: str, top_k: int = 3) -> str:
    query_vector = embed_text(query)
    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True,
    )
    chunks = []
    for match in results.matches:
        md = match.metadata or {}
        chunks.append(
            f"[{md.get('ticker', 'N/A')} | {md.get('doc_type', 'N/A')} | {md.get('date', 'N/A')}]\n"
            f"{md.get('text', '')}"
        )
    return "\n\n".join(chunks)


question = "What are the key financial highlights for Acme Corp?"
context = retrieve_context(question)
print(context[:1000])
```
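Pinecone queries also accept a metadata `filter`, which is how you scope retrieval to one ticker or document type instead of searching the whole index. A filtered variant of `retrieve_context`; the helper name and parameters are illustrative, while the `$eq` filter syntax is standard Pinecone:

```python
# Scoped retrieval: only vectors whose metadata matches the filter are eligible.
def retrieve_context_filtered(
    query: str, ticker: str, doc_type: str, top_k: int = 3
) -> str:
    results = index.query(
        vector=embed_text(query),
        top_k=top_k,
        include_metadata=True,
        filter={
            "ticker": {"$eq": ticker},
            "doc_type": {"$eq": doc_type},
        },
    )
    return "\n\n".join(
        (match.metadata or {}).get("text", "") for match in results.matches
    )


# Example: restrict the search to ACME earnings notes.
print(retrieve_context_filtered("What was revenue growth?", "ACME", "earnings_note"))
```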
### 5. Generate the final answer with OpenAI
Now feed the retrieved context into a chat completion request. For banking workflows, keep the prompt strict: answer only from provided context and flag missing information instead of inventing it.
```python
def answer_with_rag(question: str) -> str:
    context = retrieve_context(question)
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an investment banking assistant. "
                    "Use only the provided context. "
                    "If the context is insufficient, say so clearly."
                ),
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion:\n{question}",
            },
        ],
        temperature=0.1,
    )
    return response.choices[0].message.content


print(answer_with_rag("Summarize Acme Corp's financial highlights in two bullets."))
```
## Testing the Integration
Run a simple end-to-end check after indexing at least one document chunk.
```python
test_question = "What revenue growth did Acme Corp report?"
result = answer_with_rag(test_question)
print("QUESTION:", test_question)
print("ANSWER:", result)
```
Expected output (exact wording will vary):

```text
QUESTION: What revenue growth did Acme Corp report?
ANSWER: Acme Corp reported 18% YoY revenue growth to $420M based on the indexed financial highlights.
```
If you get an empty or vague answer, check these first:

- The document was actually upserted into Pinecone
- Your index dimension matches your embedding model output (a quick check is sketched below)
- Metadata includes the text needed to build context
- Retrieval `top_k` is high enough to surface relevant chunks
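For the dimension check in particular, here is a quick sanity script comparing the index configuration against a live embedding, assuming the clients from step 1 are in scope:

```python
# The index dimension must equal the embedding size
# (1536 for text-embedding-3-small).
index_info = pc.describe_index(index_name)
embedding_dim = len(embed_text("dimension probe"))
print("Index dimension:", index_info.dimension)
print("Embedding dimension:", embedding_dim)
assert index_info.dimension == embedding_dim, "Recreate the index or change embedding models"
```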
## Real-World Use Cases
- **Deal desk Q&A assistant:** Ask questions over CIMs, management presentations, and diligence notes without manually searching PDFs.
- **Earnings summary agent:** Retrieve recent transcripts and internal research notes, then generate a concise banker-ready summary.
- **Comparable company analysis helper:** Store comps tables and analyst notes in Pinecone, then have OpenAI draft valuation commentary grounded in your internal data.
## Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.