How to Integrate OpenAI with Pinecone for Retail Banking Startups
Combining OpenAI with Pinecone gives you a practical pattern for building bank-grade AI agents that can answer customer questions, retrieve policy documents, and ground responses in approved internal knowledge. For a startup, this is the fastest path to a support assistant or advisor bot that is useful without hallucinating product terms, fee schedules, or compliance rules.
Prerequisites
- Python 3.10+
- An OpenAI API key
- A Pinecone API key
- A Pinecone index created with the right dimension for your embedding model (a one-off creation sketch follows the environment variables below)
- The SDKs below installed with pip
- Basic access to your retail banking knowledge base:
  - FAQs
  - product sheets
  - policy PDFs converted to text
  - support macros or internal runbooks
Install the SDKs:
pip install openai pinecone tiktoken python-dotenv
Set environment variables:
export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="banking-kb"
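If the index does not exist yet, create it once before indexing anything. The sketch below is one way to do that with the current Pinecone SDK; the serverless spec, AWS cloud, and us-east-1 region are assumptions to adjust for your own project, and the dimension of 1536 matches text-embedding-3-small.
import os
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = os.environ["PINECONE_INDEX_NAME"]

# Create the index once; text-embedding-3-small returns 1536-dimensional vectors,
# so the index dimension must be 1536 as well.
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )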
Integration Steps
- Initialize both clients
Use the OpenAI Python SDK for embeddings and chat responses, and Pinecone for vector storage and retrieval.
import os
from openai import OpenAI
from pinecone import Pinecone
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
- Create embeddings for banking documents
Take your retail banking content, split it into chunks, and embed each chunk with OpenAI; a minimal chunking sketch follows the code below. Use text-embedding-3-small unless you have a reason to pay more for larger vectors.
documents = [
    {
        "id": "fee_001",
        "text": "Monthly maintenance fee is waived if the customer maintains a minimum balance of $1,500.",
        "metadata": {"source": "fees.pdf", "type": "account_fee"}
    },
    {
        "id": "card_002",
        "text": "Debit card replacement takes 3 to 5 business days. Expedited shipping is available for $25.",
        "metadata": {"source": "cards.pdf", "type": "debit_card"}
    }
]

texts = [doc["text"] for doc in documents]
embeddings_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

vectors = []
for doc, emb in zip(documents, embeddings_response.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {
            **doc["metadata"],
            "text": doc["text"]
        }
    })
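The example above embeds two short hand-written snippets. Real policy PDFs and FAQ pages are much longer, so split them before embedding. A minimal token-based chunker using tiktoken (installed earlier) is sketched below; the 400-token chunk size, the split_into_chunks name, and the example file name are illustrative choices, not requirements.
import tiktoken

def split_into_chunks(text, max_tokens=400):
    # cl100k_base is the tokenizer family used by the text-embedding-3 models.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    chunks = []
    for start in range(0, len(tokens), max_tokens):
        chunks.append(enc.decode(tokens[start:start + max_tokens]))
    return chunks

# Example: turn one long policy document into several embeddable chunks.
policy_text = open("overdraft_policy.txt").read()
chunks = split_into_chunks(policy_text)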
- Upsert vectors into Pinecone
Store the embeddings so your agent can retrieve relevant banking context later.
index.upsert(vectors=vectors)
print(f"Upserted {len(vectors)} vectors into {index_name}")
- Retrieve relevant context for a user question
When a customer asks something like “How do I avoid monthly fees?”, embed the query and search Pinecone first.
query = "How can I avoid monthly account fees?"
query_embedding = openai_client.embeddings.create(
model="text-embedding-3-small",
input=[query]
).data[0].embedding
results = index.query(
vector=query_embedding,
top_k=3,
include_metadata=True
)
contexts = []
for match in results.matches:
contexts.append(match.metadata["text"])
print(contexts)
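Because the upsert stored metadata alongside each vector, you can also filter retrieval when you already know the topic. The filter below reuses the type field from the example documents; the field name and value are whatever you chose in your own metadata.
filtered_results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    # Only consider chunks tagged as account fee content.
    filter={"type": {"$eq": "account_fee"}}
)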
- Generate the final answer with OpenAI using retrieved context
Now pass only the retrieved bank-approved context into the model. This is the core RAG pattern that keeps answers grounded.
context_block = "\n\n".join([f"- {c}" for c in contexts])
messages = [
    {
        "role": "system",
        "content": (
            "You are a retail banking assistant. "
            "Answer only using the provided context. "
            "If the context does not contain the answer, say you do not know."
        )
    },
    {
        "role": "user",
        "content": f"Context:\n{context_block}\n\nQuestion: {query}"
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2
)
print(response.choices[0].message.content)
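In production, also handle the case where retrieval returns nothing relevant, otherwise the model is asked to answer from an empty or off-topic context. A minimal guard using the match scores is sketched below; the 0.3 cutoff is an illustrative number to tune against your own data.
MIN_SCORE = 0.3  # illustrative cutoff; tune against your own retrieval quality

relevant = [m for m in results.matches if m.score >= MIN_SCORE]

if not relevant:
    print("I could not find this in the approved banking knowledge base. Please contact support.")
else:
    context_block = "\n\n".join(f"- {m.metadata['text']}" for m in relevant)
    # Build messages and call chat.completions.create as shown above.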
Testing the Integration
Run a full round trip: embed a query, retrieve from Pinecone, then generate an answer with OpenAI.
test_query = "What balance do I need to waive the monthly fee?"
query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[test_query]
).data[0].embedding

matches = index.query(
    vector=query_embedding,
    top_k=1,
    include_metadata=True
)
retrieved_text = matches.matches[0].metadata["text"]

answer = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer only from context."},
        {"role": "user", "content": f"Context: {retrieved_text}\n\nQuestion: {test_query}"}
    ]
)
print("Retrieved:", retrieved_text)
print("Answer:", answer.choices[0].message.content)
Expected output:
Retrieved: Monthly maintenance fee is waived if the customer maintains a minimum balance of $1,500.
Answer: The monthly maintenance fee is waived when the customer maintains a minimum balance of $1,500.
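Once the round trip works, it is convenient to wrap the flow in one function you can call from an API handler or chat loop. This is just a consolidation of the steps above; answer_banking_question is a name chosen for this sketch.
def answer_banking_question(question: str, top_k: int = 3) -> str:
    # 1. Embed the incoming question.
    query_embedding = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=[question]
    ).data[0].embedding

    # 2. Retrieve the most relevant approved banking content.
    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        include_metadata=True
    )
    contexts = [m.metadata["text"] for m in results.matches]

    # 3. Generate an answer grounded only in that context.
    context_block = "\n\n".join(f"- {c}" for c in contexts)
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a retail banking assistant. "
                    "Answer only using the provided context. "
                    "If the context does not contain the answer, say you do not know."
                )
            },
            {"role": "user", "content": f"Context:\n{context_block}\n\nQuestion: {question}"}
        ],
        temperature=0.2
    )
    return response.choices[0].message.content

print(answer_banking_question("How long does a debit card replacement take?"))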
Real-World Use Cases
- Customer support agent: answer questions about fees, card replacement timelines, overdraft rules, and account eligibility using approved bank content.
- Branch staff copilot: let employees ask natural-language questions and retrieve policy snippets before they speak to customers.
- Onboarding assistant: guide new retail banking customers through account setup, required documents, and common next steps without sending them through long FAQ pages.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.