How to Integrate OpenAI with Pinecone for RAG in Banking
Combining OpenAI for banking with Pinecone gives you a practical RAG stack for regulated workflows: the model handles reasoning and response generation, while Pinecone stores the bank’s policy docs, product guides, KYC procedures, and support knowledge in a retrievable format. That means your agent can answer from approved internal sources instead of guessing, which is the difference between a useful assistant and a compliance risk.
Prerequisites
- Python 3.10+
- An OpenAI API key
- A Pinecone API key
- A Pinecone index created with the right dimension for your embedding model
- Internal documents you want to retrieve from:
  - PDF policy docs
  - FAQ pages
  - onboarding playbooks
  - product manuals
- Installed packages: `openai`, `pinecone`, `python-dotenv`

```shell
pip install openai pinecone python-dotenv
```
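The index dimension must match the output size of the embedding model you pick, or upserts and queries will fail. As a quick reference, the published dimensions for OpenAI's embedding models can be kept in a lookup table (`index_dimension_for` is a hypothetical helper, not part of either SDK):

```python
# Published output dimensions for OpenAI's embedding models.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def index_dimension_for(model: str) -> int:
    """Return the dimension a Pinecone index needs for the given embedding model."""
    return EMBEDDING_DIMENSIONS[model]

print(index_dimension_for("text-embedding-3-small"))  # 1536
```

When you create the index, pass this value as the `dimension` argument.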
Integration Steps
- Set up your environment variables.

Keep secrets out of source control. Use a `.env` file or your secret manager.

```python
import os
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_INDEX_NAME = os.getenv("PINECONE_INDEX_NAME")
```
- Initialize the OpenAI and Pinecone clients.

For OpenAI, use the official client and create embeddings with `client.embeddings.create()`. For Pinecone, initialize the client and connect to your existing index.

```python
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=OPENAI_API_KEY)
pc = Pinecone(api_key=PINECONE_API_KEY)
index = pc.Index(PINECONE_INDEX_NAME)
```
- Chunk documents and generate embeddings.

RAG only works if retrieval is clean. Split documents into small chunks, then embed each chunk before storing it in Pinecone.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    """Split text into fixed-size chunks with a small overlap between them."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        start += chunk_size - overlap
    return chunks

document_id = "banking_policy_001"
text = """
Customers can reset their password after verifying identity through MFA.
For account disputes above $5000, escalate to the fraud review queue.
"""

chunks = chunk_text(text)

embeddings_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=chunks,
)

vectors = []
for i, item in enumerate(embeddings_response.data):
    vectors.append({
        "id": f"{document_id}-chunk-{i}",
        "values": item.embedding,
        "metadata": {
            "doc_id": document_id,
            "chunk_index": i,
            "text": chunks[i],
            "source": "internal_policy",
        },
    })

index.upsert(vectors=vectors)
```
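A single `upsert` call is fine for a handful of chunks, but a full policy library should be written in batches to stay under Pinecone's per-request limits. A minimal sketch, assuming `batched` is a helper you define yourself (the batch size of 100 is an illustrative choice):

```python
def batched(items, batch_size=100):
    """Yield successive slices of `items`, each at most `batch_size` long."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# In the ingestion step you would then write:
# for batch in batched(vectors, batch_size=100):
#     index.upsert(vectors=batch)

print([len(b) for b in batched(list(range(250)), batch_size=100)])  # [100, 100, 50]
```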
- Retrieve relevant context from Pinecone for a user query.

When a user asks a question, embed the query with the same embedding model and run a similarity search against Pinecone.

```python
query = "How do we handle disputes over $5000?"

query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[query],
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
)

contexts = [match["metadata"]["text"] for match in results["matches"]]
context_block = "\n\n".join(contexts)
print(context_block)
```
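Keep in mind that `top_k` returns the k nearest chunks no matter how near they actually are, so out-of-scope questions can still pull back weak matches. One guard is a minimum-score cutoff before building the context block. A sketch (the 0.3 threshold is an illustrative assumption; tune it against your own data):

```python
def filter_matches(matches, min_score=0.3):
    """Keep only matches whose similarity score clears the cutoff."""
    return [m for m in matches if m["score"] >= min_score]

# Example with fake records shaped like Pinecone query matches:
fake_matches = [
    {"score": 0.82, "metadata": {"text": "escalate to fraud review"}},
    {"score": 0.12, "metadata": {"text": "unrelated onboarding note"}},
]
print([m["metadata"]["text"] for m in filter_matches(fake_matches)])
# ['escalate to fraud review']
```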
- Send retrieved context to OpenAI for grounded generation.

This is where the agent becomes useful. Pass the retrieved policy text into the chat completion request and instruct the model to answer only from that context.

```python
system_prompt = (
    "You are a banking assistant. Answer only using the provided context. "
    "If the context does not contain the answer, say you don't have enough information."
)

user_prompt = f"""
Context:
{context_block}

Question:
{query}
"""

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
    temperature=0.1,
)

print(response.choices[0].message.content)
```
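In a regulated setting it also helps to label each retrieved chunk with its source, so the model can cite which policy it answered from. A small sketch of that idea (`build_context` is a hypothetical helper; it assumes the metadata fields used in the ingestion step above):

```python
def build_context(matches):
    """Prefix each chunk with its document id and chunk index for citation."""
    lines = []
    for m in matches:
        meta = m["metadata"]
        lines.append(f"[{meta['doc_id']} #{meta['chunk_index']}] {meta['text']}")
    return "\n\n".join(lines)

fake_matches = [
    {"metadata": {"doc_id": "banking_policy_001", "chunk_index": 0,
                  "text": "Escalate disputes above $5000."}},
]
print(build_context(fake_matches))
# [banking_policy_001 #0] Escalate disputes above $5000.
```

You can then ask the model to quote the bracketed source ids in its answer, which makes spot-checking responses against policy much faster.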
Testing the Integration
Run an end-to-end test with a known policy question. You want to verify three things:

- The query embedding call succeeds
- Pinecone returns relevant chunks
- OpenAI generates an answer grounded in those chunks
```python
test_query = "What happens when a dispute is above $5000?"

test_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[test_query],
).data[0].embedding

test_results = index.query(
    vector=test_embedding,
    top_k=2,
    include_metadata=True,
)

test_contexts = [m["metadata"]["text"] for m in test_results["matches"]]
test_context_block = "\n".join(test_contexts)

test_response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer only from context."},
        {"role": "user", "content": f"Context:\n{test_context_block}\n\nQuestion:\n{test_query}"},
    ],
)

print(test_response.choices[0].message.content)
```
Expected output:
Disputes above $5000 should be escalated to the fraud review queue.
If you get an answer that mentions unsupported details, tighten your prompt and reduce temperature. If retrieval is off, fix chunking or re-check your index dimension against the embedding model output size.
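A dimension mismatch is cheap to catch before the request ever reaches Pinecone. A sketch of an early sanity check (`check_embedding_dimension` is a hypothetical helper; 1536 is the published output size of `text-embedding-3-small`):

```python
def check_embedding_dimension(embedding, index_dimension):
    """Fail fast when a query vector can't match the index configuration."""
    if len(embedding) != index_dimension:
        raise ValueError(
            f"embedding has {len(embedding)} dims, index expects {index_dimension}"
        )

# text-embedding-3-small produces 1536-dimensional vectors:
check_embedding_dimension([0.0] * 1536, 1536)  # passes silently
```

Run this on the first embedding you generate after any change to the embedding model or index configuration.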
Real-World Use Cases

- Policy assistant for operations teams. Staff ask questions about card disputes, AML escalation paths, loan servicing rules, or KYC checks. The agent retrieves approved policy text from Pinecone and answers with OpenAI.
- Customer support copilot. Agents get suggested responses based on product docs and internal runbooks. This reduces handle time and keeps answers aligned with current bank policy.
- Compliance knowledge search. Analysts query procedures across multiple document sets. Pinecone handles semantic retrieval; OpenAI turns results into readable summaries or action steps.
Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit