How to Integrate OpenAI with Pinecone for Production Retail Banking AI
Combining OpenAI for retail banking with Pinecone gives you a practical retrieval layer for customer-facing and internal banking agents. You get natural language generation from OpenAI and low-latency semantic search from Pinecone, which is what you need for things like policy Q&A, product recommendation, dispute triage, and relationship-manager copilots.
Prerequisites
- Python 3.10+
- An OpenAI API key with access to the models you plan to use
- A Pinecone account and API key
- A Pinecone index created with the correct vector dimension for your embedding model
- A source of retail banking documents:
  - product FAQs
  - fee schedules
  - compliance playbooks
  - loan policy docs
  - customer support macros
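If the index doesn't exist yet, you can create it up front. A minimal sketch, assuming a recent Pinecone SDK and a serverless index; the cloud and region values are placeholders to adjust for your account:

```python
# Dimensions for common OpenAI embedding models; the index dimension
# must match the embedding model you plan to use.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def create_banking_index(index_name: str, model: str = "text-embedding-3-small"):
    # Imported inside the function so the sketch reads standalone.
    import os
    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    if index_name not in pc.list_indexes().names():
        pc.create_index(
            name=index_name,
            dimension=EMBEDDING_DIMS[model],
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),
        )
```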
Install the SDKs:
```bash
pip install openai pinecone tiktoken python-dotenv
```
Set environment variables:
```bash
export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="retail-banking-index"
```
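Since python-dotenv is in the install list, you can load a local `.env` file at startup and fail fast if anything is missing. A small sketch; the helper name is ours, not part of either SDK:

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX_NAME")

def missing_config(env) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# At startup, python-dotenv can populate os.environ from a .env file first:
# from dotenv import load_dotenv; load_dotenv()
missing = missing_config(os.environ)
if missing:
    print(f"Missing configuration: {', '.join(missing)}")
```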
Integration Steps
1. Initialize both clients
Start by wiring up the OpenAI and Pinecone clients in the same service. Keep this in a dedicated module so your agent layer can reuse it.
```python
import os

from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
```
2. Create embeddings for banking content
Use OpenAI embeddings to turn policy text into vectors. For production, chunk your documents before embedding them. Keep chunks small enough to preserve retrieval quality.
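The snippets below use short one-sentence documents, so chunking is skipped. For real policy documents you'd split first; a minimal word-window sketch (a token-based splitter built on the tiktoken package installed above works the same way):

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list:
    """Split text into overlapping word-window chunks for embedding."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already covers the tail
    return chunks
```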
```python
banking_docs = [
    {
        "id": "fee_schedule_001",
        "text": "Wire transfers over $5,000 incur a $25 outgoing fee for standard accounts."
    },
    {
        "id": "mortgage_policy_014",
        "text": "Mortgage pre-approval requires two recent pay stubs, bank statements, and a credit report."
    },
]

embedding_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[doc["text"] for doc in banking_docs]
)

vectors = []
for doc, item in zip(banking_docs, embedding_response.data):
    vectors.append({
        "id": doc["id"],
        "values": item.embedding,
        "metadata": {
            "text": doc["text"],
            "source": "retail_banking_kb"
        }
    })
```
3. Upsert vectors into Pinecone
Push the embedded chunks into your index. Use metadata aggressively; it makes filtering easier later when you need to separate product lines or compliance domains.
```python
upsert_result = index.upsert(vectors=vectors)
print(upsert_result)
```
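Upsert requests have size limits, so larger corpora are usually pushed in batches. A sketch; the batch size of 100 is a common choice, not a hard limit:

```python
def batches(items: list, batch_size: int = 100):
    """Yield successive slices of at most batch_size vectors."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# for batch in batches(vectors):
#     index.upsert(vectors=batch)
```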
A typical production pattern is to include fields like:
- `product_line`: checking, savings, mortgage, cards
- `jurisdiction`: US, UK, EU
- `doc_type`: faq, policy, procedure
- `effective_date`: version control for compliance
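Those fields pay off at query time. Pinecone supports Mongo-style metadata filters, so retrieval can be scoped to one product line and jurisdiction; a sketch following the field names above (the helper is ours, and `effective_date` is stored numerically as YYYYMMDD because Pinecone's range operators apply to numbers):

```python
def policy_filter(product_line: str, jurisdiction: str, as_of: int) -> dict:
    """Build a Pinecone metadata filter for scoped, compliance-aware retrieval."""
    return {
        "product_line": {"$eq": product_line},
        "jurisdiction": {"$eq": jurisdiction},
        "effective_date": {"$lte": as_of},  # only docs already in effect
    }

# index.query(vector=query_embedding, top_k=3,
#             filter=policy_filter("mortgage", "US", 20240101),
#             include_metadata=True)
```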
4. Retrieve relevant context at query time
When a user asks a question, embed the query with the same model and search Pinecone for top matches. This is the retrieval step that grounds OpenAI responses in bank-approved content.
```python
query = "What do I need for mortgage pre-approval?"

query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=query
).data[0].embedding

search_results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in search_results.matches:
    print(match.id, match.score, match.metadata["text"])
```
5. Generate the final answer with OpenAI
Pass retrieved context into a chat completion call. In retail banking, this is where you enforce tone, disclaimers, and grounded answers.
```python
context_snippets = "\n\n".join(
    f"- {match.metadata['text']}"
    for match in search_results.matches
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a retail banking assistant. Answer only using the provided context. "
            "If the answer is not in context, say you don't have enough information."
        )
    },
    {
        "role": "user",
        "content": f"Question: {query}\n\nContext:\n{context_snippets}"
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2
)

print(response.choices[0].message.content)
```
Testing the Integration
Run an end-to-end test that embeds a query, retrieves from Pinecone, and generates an answer.
```python
def answer_banking_question(question: str) -> str:
    q_embed = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=question
    ).data[0].embedding

    results = index.query(
        vector=q_embed,
        top_k=3,
        include_metadata=True
    )

    context = "\n\n".join(
        m.metadata["text"] for m in results.matches if m.metadata and "text" in m.metadata
    )

    resp = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer only from context."},
            {"role": "user", "content": f"{question}\n\nContext:\n{context}"}
        ],
        temperature=0.1,
    )
    return resp.choices[0].message.content


print(answer_banking_question("What do I need for mortgage pre-approval?"))
```
Expected output:

```
Mortgage pre-approval requires two recent pay stubs, bank statements, and a credit report.
```
If retrieval is working correctly, your answer should closely reflect one of the stored policy snippets instead of hallucinating extra requirements.
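You can automate a cheap version of this check: compare the answer's vocabulary against the retrieved context and flag answers that drift. A crude overlap heuristic, useful as a smoke test but not a substitute for real evaluation:

```python
def looks_grounded(answer: str, context: str, threshold: float = 0.6) -> bool:
    """Flag answers whose content words mostly appear in the retrieved context."""
    stop = {"the", "a", "an", "and", "or", "to", "of", "for", "in", "is", "are"}
    words = [w.strip(".,").lower() for w in answer.split()]
    content = [w for w in words if w and w not in stop]
    if not content:
        return False
    ctx = context.lower()
    hits = sum(1 for w in content if w in ctx)
    return hits / len(content) >= threshold
```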
Real-World Use Cases
- Retail banking support agent: answer questions about fees, card replacement timelines, wire limits, overdraft rules, and loan requirements using approved internal docs.
- RM copilot: help relationship managers find relevant product details before customer calls by searching across policies and playbooks.
- Compliance-aware FAQ bot: ground answers in versioned policy content so customer-facing responses stay aligned with current disclosures and procedures.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.