How to Integrate OpenAI for investment banking with Pinecone for production AI

By Cyprian Aarons · Updated 2026-04-21

Tags: openai-for-investment-banking, pinecone, production-ai

Combining OpenAI for investment banking with Pinecone gives you a practical retrieval layer for deal docs, market research, and internal knowledge. The pattern is simple: OpenAI handles reasoning and generation, while Pinecone stores embeddings so your agent can pull the right context before answering.

For production AI in banking, that means fewer hallucinations, better auditability, and faster answers over large document sets like CIMs, pitch books, earnings transcripts, and policy memos.

Prerequisites

  • Python 3.10+
  • An OpenAI API key
  • A Pinecone API key
  • Access to a Pinecone project and index
  • pip installed
  • A local .env file or secrets manager for credentials
  • Basic familiarity with embeddings and vector search

Install the SDKs:

pip install openai pinecone python-dotenv

Set environment variables:

export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="investment-banking-docs"
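Missing credentials tend to surface as confusing API errors deep inside the pipeline, so it can help to check them up front. A minimal sketch (the helper name and the sample dict are illustrative; it checks any mapping, so you can pass `os.environ` at startup):

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX_NAME")

def missing_keys(env) -> list[str]:
    """Return the names of required credentials that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example: an environment that forgot the Pinecone key.
example_env = {
    "OPENAI_API_KEY": "sk-...",
    "PINECONE_INDEX_NAME": "investment-banking-docs",
}
print(missing_keys(example_env))  # ['PINECONE_API_KEY']
```

In a real entry point you would call `missing_keys(os.environ)` once and exit early if the list is non-empty, instead of failing on the first API call.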

Integration Steps

  1. Initialize the OpenAI client and generate embeddings

Use OpenAI embeddings for chunked banking documents. In production, text-embedding-3-small is usually enough for retrieval; use text-embedding-3-large if you need more recall on dense financial language.

import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def embed_text(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

sample_chunk = """
Company overview: The issuer reported EBITDA growth of 18% YoY,
driven by higher recurring revenue and improved gross margin.
"""

vector = embed_text(sample_chunk)
print(len(vector))

  2. Create or connect to a Pinecone index

Pinecone stores the embedding vectors and metadata. For banking use cases, keep metadata like document type, ticker, sector, date, and source so your agent can filter precisely.

import os
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = os.environ["PINECONE_INDEX_NAME"]

existing_indexes = [idx["name"] for idx in pc.list_indexes()]

if index_name not in existing_indexes:
    pc.create_index(
        name=index_name,
        dimension=1536,  # matches text-embedding-3-small
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )

index = pc.Index(index_name)
print(f"Connected to index: {index_name}")

  3. Upsert financial document chunks into Pinecone

Chunk documents before embedding. Don’t store whole PDFs as one vector; that kills retrieval quality. Each chunk should be small enough to represent one idea from a filing, memo, or transcript.
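The examples below use pre-chunked snippets. For real filings you need a chunker; a minimal sketch using word windows with overlap (the 150-word window and 30-word overlap are illustrative defaults, not tuned values; production systems often chunk by tokens or by document structure instead):

```python
def chunk_text(text: str, max_words: int = 150, overlap: int = 30) -> list[str]:
    """Split text into overlapping word-window chunks for embedding."""
    words = text.split()
    if len(words) <= max_words:
        return [" ".join(words)] if words else []
    chunks = []
    step = max_words - overlap  # advance so consecutive chunks share `overlap` words
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the final window already covers the tail
    return chunks

memo = "revenue " * 400  # stand-in for a long filing section
parts = chunk_text(memo)
print(len(parts), len(parts[0].split()))  # 4 150
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.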

documents = [
    {
        "id": "deal-memo-001",
        "text": "The target company has recurring revenue of $42M and EBITDA margin of 24%.",
        "metadata": {
            "source": "deal_memo",
            "ticker": "ABC",
            "sector": "software",
            "date": "2025-01-15"
        }
    },
    {
        "id": "earnings-call-001",
        "text": "Management expects Q2 revenue growth to accelerate due to pipeline conversion.",
        "metadata": {
            "source": "earnings_call",
            "ticker": "ABC",
            "sector": "software",
            "date": "2025-02-10"
        }
    }
]

vectors = []
for doc in documents:
    embedding = embed_text(doc["text"])
    vectors.append({
        "id": doc["id"],
        "values": embedding,
        "metadata": {**doc["metadata"], "text": doc["text"]}
    })

index.upsert(vectors=vectors)
print("Upsert complete")

  4. Query Pinecone with a user question

At runtime, embed the user question with OpenAI, search Pinecone for relevant chunks, then pass those chunks back into OpenAI as context.

query = "What drove EBITDA improvement for ABC?"
query_embedding = embed_text(query)

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    filter={"ticker": {"$eq": "ABC"}}
)

for match in results["matches"]:
    print(match["id"], match["score"], match["metadata"]["source"])

  5. Generate the final answer with OpenAI using retrieved context

This is the actual RAG loop. Keep the prompt grounded in retrieved evidence and force the model to answer only from context when used in regulated workflows.

context_blocks = []
for match in results["matches"]:
    md = match["metadata"]
    context_blocks.append(f"[{md['source']}] {md['text']}")

context = "\n\n".join(context_blocks)

response = client.responses.create(
    model="gpt-4.1-mini",
    input=f"""
You are an investment banking assistant.
Answer only using the provided context.

Context:
{context}

Question:
{query}
"""
)

print(response.output_text)

Testing the Integration

Run a full end-to-end check: embed a query, retrieve from Pinecone, and generate an answer from OpenAI.

test_query = "Why did ABC's margins improve?"
q_vec = embed_text(test_query)

search = index.query(
    vector=q_vec,
    top_k=2,
    include_metadata=True,
    filter={"ticker": {"$eq": "ABC"}}
)

ctx = "\n".join(
    m["metadata"]["text"] for m in search["matches"]
)

answer = client.responses.create(
    model="gpt-4.1-mini",
    input=f"Use only this context:\n{ctx}\n\nQuestion: {test_query}"
)

print(answer.output_text)

Expected output (exact wording will vary between runs):

ABC's margins improved due to higher recurring revenue and better gross margin efficiency.

If you get no matches, check these first:

  • Your embedding dimension matches the Pinecone index dimension
  • The metadata filter keys exist on stored vectors
  • The same model is used for both indexing and querying embeddings
  • Your chunks are not too large
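The first failure mode is easy to guard against locally: the OpenAI embedding models have fixed output sizes, so you can check the model against the index dimension before upserting. A sketch (the helper is illustrative; in production you could compare against the live value from `pc.describe_index(index_name)` instead of a hardcoded one):

```python
# Known output dimensions for the OpenAI embedding models used above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def dimension_mismatch(model: str, index_dimension: int) -> bool:
    """True when the model's embedding size won't fit the Pinecone index."""
    expected = EMBEDDING_DIMS.get(model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model}")
    return expected != index_dimension

print(dimension_mismatch("text-embedding-3-small", 1536))  # False: safe to upsert
print(dimension_mismatch("text-embedding-3-large", 1536))  # True: recreate the index
```

Remember that switching models mid-project means re-embedding everything; vectors from different models are not comparable even when the dimensions happen to match.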

Real-World Use Cases

  • Deal team knowledge assistant

    • Search past CIMs, diligence notes, board decks, and comps quickly.
    • Answer questions like “Show me similar SaaS targets with >20% EBITDA margins.”
  • Earnings call analyst copilot

    • Index transcripts by ticker and quarter.
    • Summarize guidance changes or flag management tone shifts across quarters.
  • Compliance-aware internal research agent

    • Restrict retrieval by business unit or document class.
    • Keep answers grounded in approved source material for auditability.
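The compliance restriction in the last use case maps directly onto Pinecone metadata filters. A minimal sketch using Pinecone's `$and`/`$in` filter operators (the field names `business_unit` and `doc_class` are hypothetical; use whatever metadata keys you attached at upsert time):

```python
def build_access_filter(allowed_units: list[str], allowed_doc_classes: list[str]) -> dict:
    """Build a Pinecone metadata filter restricting retrieval to approved sources."""
    return {
        "$and": [
            {"business_unit": {"$in": allowed_units}},
            {"doc_class": {"$in": allowed_doc_classes}},
        ]
    }

# A banker on the M&A team may only retrieve approved deal material.
f = build_access_filter(["m_and_a"], ["deal_memo", "earnings_call"])
print(f)
```

You would pass this dict as the `filter` argument to `index.query(...)`, the same way the `ticker` filter is used in the query step above; enforcing it server-side in Pinecone means an over-eager prompt can never surface documents the user isn't entitled to see.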

For production AI in investment banking, this pattern is the baseline: OpenAI for reasoning and generation, Pinecone for retrieval at scale. Once that works reliably, you can add access control, citations, logging, evals, and human review without changing the core architecture.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
