How to Integrate OpenAI with Pinecone for Retail Banking AI Agents
Retail banking agents need two things to be useful: a strong reasoning model and a retrieval layer that can pull the right policy, product, or customer context fast. OpenAI gives you the agent brain; Pinecone gives you the memory layer for embeddings, so your assistant can answer with bank-specific grounding instead of generic chat.
Prerequisites
- Python 3.10+ with pip installed
- An OpenAI API key
- A Pinecone API key and an existing index
- A retail banking knowledge base to embed: product FAQs, fee schedules, KYC/AML policy docs, support playbooks
- These Python packages: `openai`, `pinecone`, `python-dotenv`

Install them:

```bash
pip install openai pinecone python-dotenv
```
Set your environment variables:

```bash
export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="retail-banking-agent"
```
Integration Steps
1) Initialize both SDKs
Use OpenAI for embeddings and chat completions, and Pinecone for vector storage and retrieval.
```python
import os

from dotenv import load_dotenv
from openai import OpenAI
from pinecone import Pinecone

# Load keys from .env / the environment
load_dotenv()

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
```
At this point you have:
- an OpenAI client for generating embeddings and responses
- a Pinecone index handle for upsert/query operations
2) Prepare retail banking documents and create embeddings
Chunk your source content before embedding. In production, keep chunks small enough to retrieve precisely, usually 200–500 tokens.
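As a minimal sketch, a naive word-based splitter is enough to get started; the `chunk_text` helper below is an illustration, not a prescribed implementation (a real pipeline would count tokens with something like `tiktoken`):

```python
def chunk_text(text: str, max_words: int = 300, overlap: int = 40):
    """Naive word-count chunking; the sizes approximate a 200-500 token target."""
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        start += max_words - overlap  # overlap carries context across chunk boundaries
    return chunks
```

The sample corpus below is already chunk-sized, so the walkthrough skips that step.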
```python
bank_docs = [
    {
        "id": "fee_schedule_001",
        "text": "Checking account overdraft fee is $35 per item. Daily limit applies.",
        "metadata": {"source": "fees", "product": "checking"},
    },
    {
        "id": "kyc_policy_001",
        "text": "KYC requires government ID, proof of address, and date of birth verification.",
        "metadata": {"source": "compliance", "topic": "kyc"},
    },
    {
        "id": "card_dispute_001",
        "text": "Card disputes must be filed within 60 days of the transaction date.",
        "metadata": {"source": "support", "topic": "disputes"},
    },
]


def embed_text(text: str):
    """Return the embedding vector for a single piece of text."""
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding


vectors = []
for doc in bank_docs:
    vectors.append({
        "id": doc["id"],
        "values": embed_text(doc["text"]),
        "metadata": {
            **doc["metadata"],
            "text": doc["text"],  # keep the raw text so retrieval can return it
        },
    })
```
Use `text-embedding-3-small` unless you have a reason to pay for larger embeddings; for most banking retrieval tasks, it's enough. Whatever model you pick, your Pinecone index dimension must match its output size (1536 for `text-embedding-3-small` by default).
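If the index doesn't exist yet, here's a minimal sketch of creating one with a matching dimension, assuming a serverless index on AWS `us-east-1`:

```python
from pinecone import ServerlessSpec

# Create the index only if it's missing; dimension must match the embedding model.
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,  # text-embedding-3-small output size
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
```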
3) Upsert embeddings into Pinecone
Now persist those vectors in Pinecone so your agent can retrieve them later.
```python
index.upsert(vectors=vectors)
print(f"Upserted {len(vectors)} vectors into {index_name}")
```
A few production notes:
- store the original text in metadata if your chunk size is small enough
- add document type, jurisdiction, product line, and version tags
- avoid storing sensitive customer data in plain metadata unless your security posture allows it
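At scale, batch the upserts rather than sending everything in one request; a minimal sketch, reusing the `vectors` list from step 2:

```python
BATCH_SIZE = 100  # keep each request comfortably under Pinecone's payload limits

for start in range(0, len(vectors), BATCH_SIZE):
    index.upsert(vectors=vectors[start:start + BATCH_SIZE])
```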
4) Query Pinecone from the user question
When a user asks a banking question, embed the query first, then search Pinecone for relevant context.
```python
def retrieve_context(question: str, top_k: int = 3):
    """Embed the question and return the text of the top-k matching chunks."""
    q_embedding = embed_text(question)
    results = index.query(
        vector=q_embedding,
        top_k=top_k,
        include_metadata=True,
    )

    contexts = []
    for match in results.matches:
        metadata = match.metadata or {}
        contexts.append(metadata.get("text", ""))
    return contexts


question = "What documents do I need to open a checking account?"
contexts = retrieve_context(question)

for i, ctx in enumerate(contexts, start=1):
    print(f"{i}. {ctx}")
```
This is the core retrieval step. The agent should not answer from memory when policy or product details matter.
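Metadata tags from step 3 also let you narrow retrieval before the model ever sees it; a sketch using Pinecone's filter syntax with the `source` tag from the sample docs:

```python
# Restrict the search to support-playbook chunks only.
results = index.query(
    vector=embed_text("How do I dispute a card charge?"),
    top_k=3,
    include_metadata=True,
    filter={"source": {"$eq": "support"}},
)
```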
5) Send retrieved context to OpenAI and generate the answer
Pass the retrieved snippets into a chat completion call so the model answers using grounded context.
```python
def answer_question(question: str):
    contexts = retrieve_context(question)

    system_prompt = (
        "You are a retail banking assistant. "
        "Answer only using the provided context. "
        "If the context is insufficient, say what is missing."
    )

    context_block = "\n".join(f"- {c}" for c in contexts)
    user_prompt = f"Question: {question}\n\nContext:\n{context_block}"

    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        temperature=0.2,  # keep answers close to the retrieved policy text
    )
    return response.choices[0].message.content


print(answer_question("What documents do I need to open a checking account?"))
```
If you’re building an actual agent loop, this retrieval step becomes a tool call (see the sketch after this list). The pattern stays the same:
- user asks a question
- retrieve relevant bank knowledge from Pinecone
- feed context to OpenAI
- return a grounded answer
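Here's a minimal sketch of exposing retrieval as a tool with OpenAI tool calling; the tool name `retrieve_bank_knowledge` and its schema are illustrative, not a fixed API:

```python
import json

# Describe the retrieval function so the model can decide when to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "retrieve_bank_knowledge",
        "description": "Search the bank knowledge base for policies, fees, and product details.",
        "parameters": {
            "type": "object",
            "properties": {
                "question": {"type": "string", "description": "The user's banking question."},
            },
            "required": ["question"],
        },
    },
}]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the overdraft fee on checking?"}],
    tools=tools,
)

# If the model asked for the tool, run retrieve_context and send the result back.
message = response.choices[0].message
if message.tool_calls:
    args = json.loads(message.tool_calls[0].function.arguments)
    contexts = retrieve_context(args["question"])
```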
Testing the Integration
Run a simple end-to-end check with a compliance-style query.
```python
test_question = "How long do customers have to file a card dispute?"
answer = answer_question(test_question)

print("QUESTION:", test_question)
print("ANSWER:", answer)
```
Expected output:

```text
QUESTION: How long do customers have to file a card dispute?
ANSWER: Customers must file card disputes within 60 days of the transaction date.
```
If you get an unrelated answer:
- check that your Pinecone index has vectors loaded (see the snippet below)
- verify embedding model consistency between upsert and query paths
- confirm your chunks actually contain the policy text you expect
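A quick way to check the first item; `describe_index_stats` is part of the Pinecone client:

```python
# Confirm vectors actually landed in the index after the upsert step.
stats = index.describe_index_stats()
print(stats.total_vector_count)  # should match the number of chunks you upserted
```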
Real-World Use Cases
- Retail banking support bot: answer questions about fees, card disputes, transfers, branch hours, and onboarding requirements using bank-approved content.
- Compliance-aware agent: retrieve KYC/AML policies and generate internal guidance for frontline staff without exposing unsupported claims.
- Personalized product assistant: combine customer profile data with indexed product docs to recommend savings accounts, credit cards, or loan options based on eligibility rules (sketched below).
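As a sketch of that third pattern, eligibility can ride along as a metadata filter; the `min_credit_tier` tag and `customer_profile` dict are hypothetical and would need to be stored on product docs at upsert time:

```python
customer_profile = {"credit_tier": 2}  # hypothetical profile from your core banking system

results = index.query(
    vector=embed_text("savings account options"),
    top_k=5,
    include_metadata=True,
    # Only surface products this customer is eligible for.
    filter={"min_credit_tier": {"$lte": customer_profile["credit_tier"]}},
)
```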
The useful pattern here is simple: OpenAI handles language generation, Pinecone handles retrieval over bank knowledge. If you keep those responsibilities cleanly separated, you get an AI agent that is faster to ship and much easier to control in production.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit