How to Integrate OpenAI with Pinecone for Banking Startups
Combining OpenAI with Pinecone gives banking startups a practical pattern for building AI agents that answer customer questions, retrieve policy or account context, and stay grounded in your own data. For a startup, this is the difference between a generic chatbot and a banking assistant that pulls the right knowledge at the right time without hallucinating.
Prerequisites
- Python 3.10+
- An OpenAI API key with access to the models you want to use
- A Pinecone API key and an existing Pinecone index
- An OpenAI embedding model (this guide uses text-embedding-3-small)
- A document set ready for ingestion:
  - FAQ pages
  - product terms
  - compliance notes
  - support runbooks
Install the SDKs:
```shell
pip install openai pinecone
```
Integration Steps
- Initialize both clients
Start by wiring up the two SDKs in the same service. Keep secrets in environment variables so your agent can run in staging and production without code changes.
```python
import os

from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("banking-knowledge-base")
```
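The prerequisites assume the index already exists. If you need to create it from code, here is a minimal sketch using the `pc` client above; it assumes the serverless Pinecone client, the `ensure_index` helper and the cloud/region are illustrative, and the dimension must match the embedding model you pick.

```python
# Output dimensions for OpenAI embedding models (assumed values; confirm
# against the current OpenAI model documentation before relying on them).
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def ensure_index(pc, name, model="text-embedding-3-small"):
    """Create the Pinecone index if it does not exist yet."""
    from pinecone import ServerlessSpec  # imported here to keep the sketch self-contained

    if name not in pc.list_indexes().names():
        pc.create_index(
            name=name,
            dimension=EMBEDDING_DIMENSIONS[model],  # must match the embedding model
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),
        )
```

Cosine is a sensible default metric for OpenAI embeddings; if the dimension and the embedding model disagree, every upsert will fail.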
- Embed banking content with OpenAI
Use OpenAI embeddings to turn banking documents into vectors. In practice, this is where you encode FAQ answers, product rules, or internal support docs before pushing them into Pinecone.
```python
documents = [
    {
        "id": "doc_001",
        "text": "Wire transfers submitted before 3 PM ET are processed same business day.",
        "metadata": {"source": "banking_faq", "topic": "wire_transfers"},
    },
    {
        "id": "doc_002",
        "text": "Savings accounts require a minimum opening deposit of $100.",
        "metadata": {"source": "product_terms", "topic": "savings_accounts"},
    },
]

texts = [doc["text"] for doc in documents]
embedding_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts,
)

vectors = []
for doc, emb in zip(documents, embedding_response.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {**doc["metadata"], "text": doc["text"]},
    })
```
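Real documents are longer than one sentence, and embedding models work best on passages of bounded size, so you will usually split each document before embedding. A minimal character-window splitter might look like this; the sizes and the `chunk_text` helper are illustrative, not part of either SDK.

```python
def chunk_text(text, max_chars=800, overlap=100):
    """Split a document into overlapping character windows for embedding."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap  # overlap preserves context across boundaries
    return chunks
```

Each chunk then becomes its own vector, with an id like `doc_001-chunk-0` so answers can still cite the source document.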
- Upsert vectors into Pinecone
Once you have embeddings, store them in Pinecone with metadata attached. That metadata is what lets your agent explain where an answer came from.
```python
index.upsert(vectors=vectors)
```
If you’re indexing at startup scale, batch this operation.
```python
def batch_upsert(index, vectors, batch_size=100):
    for i in range(0, len(vectors), batch_size):
        index.upsert(vectors=vectors[i:i + batch_size])

batch_upsert(index, vectors)
```
- Query Pinecone from an agent request
When a user asks a question, embed the query with the same model, then search Pinecone for relevant context. This is the retrieval step that grounds OpenAI responses in banking data.
```python
query = "How fast are wire transfers processed?"
query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[query],
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
)

matches = results["matches"]
for match in matches:
    print(match["id"], match["score"], match["metadata"]["text"])
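One thing to watch: `top_k` results always come back, even when nothing in the index is actually relevant. A small guard that drops weak matches before they reach the prompt is worth adding early; the 0.35 cutoff below is an illustrative starting point, not a recommended value, so tune it against your own corpus.

```python
def filter_matches(matches, min_score=0.35):
    """Keep only matches above a relevance cutoff so weak context never
    reaches the prompt; an empty result should trigger a fallback answer."""
    return [m for m in matches if m["score"] >= min_score]
```

If the filtered list is empty, skip generation and return a canned "I don't have that information" reply instead of letting the model guess.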
- Generate the final answer with OpenAI
Take the retrieved context and feed it into the chat completion call. The model should answer only from retrieved banking knowledge, not from memory.
```python
context_chunks = [
    m["metadata"]["text"] for m in matches if m.get("metadata", {}).get("text")
]
context = "\n".join(f"- {chunk}" for chunk in context_chunks)

prompt = f"""
You are a banking support assistant.
Answer using only the context below.

Context:
{context}

User question: {query}
"""

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You answer banking questions using provided context only."},
        {"role": "user", "content": prompt},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```
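One failure mode worth handling explicitly: retrieval can come back empty, and the assistant should refuse rather than improvise. A sketch of that guard, where the `build_prompt` helper and the canned reply are illustrative names, not part of either SDK:

```python
NO_CONTEXT_REPLY = "I don't have that information in the current knowledge base."


def build_prompt(query, context_chunks):
    """Return the grounded prompt, or None when there is no usable context."""
    if not context_chunks:
        return None  # caller should send NO_CONTEXT_REPLY instead of calling the model
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "You are a banking support assistant.\n"
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"User question: {query}"
    )
```

Returning `None` before the chat call also saves a model invocation on every unanswerable query.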
Testing the Integration
Run a simple end-to-end test: embed a query, retrieve matching vectors from Pinecone, then generate an answer with OpenAI.
```python
test_query = "What is the minimum opening deposit for savings accounts?"
test_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[test_query],
).data[0].embedding

test_results = index.query(
    vector=test_embedding,
    top_k=1,
    include_metadata=True,
)
top_match_text = test_results["matches"][0]["metadata"]["text"]

answer = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer only from retrieved banking context."},
        {"role": "user", "content": f"Context: {top_match_text}\n\nQuestion: {test_query}"},
    ],
)
print(answer.choices[0].message.content)
```
Expected output (wording will vary slightly between runs):
The minimum opening deposit for savings accounts is $100.
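When a test retrieves a surprising match, it helps to inspect embedding geometry directly, offline. Pinecone's cosine metric can be reproduced in a few lines; this helper is for debugging only, since the index computes it for you at query time.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Comparing a query embedding against a handful of document embeddings this way makes it easy to see whether a retrieval problem lives in the data or in the index configuration.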
Real-World Use Cases
- Banking support agents: answer product and policy questions using your own indexed documentation instead of static scripts.
- Compliance-aware copilots: retrieve approved procedures and generate responses that stay within bank policy boundaries.
- Customer onboarding assistants: guide users through account setup, required documents, and eligibility rules with grounded answers.
This pattern scales well for startups because it separates responsibilities cleanly: OpenAI handles language generation and embeddings, while Pinecone handles fast semantic retrieval over your banking knowledge base. Once that foundation is stable, you can add auth checks, PII redaction, audit logging, and tool routing without changing the core retrieval flow.
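Of those hardening steps, PII redaction is usually the first one a banking startup needs. A deliberately simple sketch that masks user input before it is logged or embedded; the regexes here are illustrative only, and a production system should use a vetted PII detection library.

```python
import re

# Illustrative patterns, not production-grade PII detection.
ACCOUNT_RE = re.compile(r"\b\d{8,17}\b")            # bare account-like digit runs
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")


def redact_pii(text):
    """Mask account numbers and email addresses in free-form user input."""
    text = ACCOUNT_RE.sub("[ACCOUNT]", text)
    return EMAIL_RE.sub("[EMAIL]", text)
```

Running every inbound message through a step like this before it touches logs, embeddings, or prompts keeps raw identifiers out of systems that were never meant to store them.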
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit