How to Integrate OpenAI for banking with Pinecone for production AI
Combining OpenAI for banking with Pinecone gives you the core pattern for production-grade banking agents: generate answers from a model, then ground those answers in your own indexed policy, product, and customer context. That means fewer hallucinations, faster retrieval of relevant documents, and a cleaner path to building assistants for KYC, support, fraud ops, and internal knowledge search.
Prerequisites
- Python 3.10+
- An OpenAI API key with access to the model you plan to use
- A Pinecone account and API key
- A Pinecone index created with the same dimension as your chosen embedding model
- The openai, pinecone, and python-dotenv packages pip installed
- Banking documents ready to chunk and embed:
  - product disclosures
  - policy PDFs
  - compliance playbooks
  - FAQ content
- Environment variables set:
  - OPENAI_API_KEY
  - PINECONE_API_KEY
  - PINECONE_INDEX_NAME

Install the SDKs:

pip install openai pinecone python-dotenv
Integration Steps
1) Initialize OpenAI and Pinecone clients
Start by loading secrets from the environment and creating both clients. Keep this in one module so your agent layer can import it cleanly.
import os
from dotenv import load_dotenv
from openai import OpenAI
from pinecone import Pinecone
load_dotenv()
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
If you are building for banking, keep this setup isolated from business logic. You want credential handling, retry policy, and observability in one place.
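That retry policy can live next to the clients. Here is a minimal sketch of a backoff decorator you could wrap around embedding and query calls; the attempt count, delay, and exception types are assumptions to tune for your environment, and flaky_call is a hypothetical stand-in for an SDK call, not part of either library.

```python
import time
from functools import wraps

def with_retries(max_attempts=3, base_delay=0.5,
                 retriable=(ConnectionError, TimeoutError)):
    """Retry a callable with exponential backoff on transient errors."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except retriable:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the error
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator

@with_retries(max_attempts=3, base_delay=0.0)
def flaky_call(state={"n": 0}):
    # Hypothetical stand-in for an embeddings or query call that fails
    # twice with a transient error, then succeeds on the third attempt.
    state["n"] += 1
    if state["n"] < 3:
        raise ConnectionError("transient")
    return "ok"
```

In production you would wrap the actual OpenAI and Pinecone calls, and likely catch the SDKs' own timeout and rate-limit exceptions rather than the generic ones used here.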
2) Create embeddings with OpenAI and upsert them into Pinecone
Use OpenAI embeddings for your document chunks, then store vectors plus metadata in Pinecone. The metadata is what lets you filter by document type, region, or policy version later.
from openai import OpenAI
from pinecone import Pinecone

# In a real service, import the shared clients from step 1 instead of
# re-creating them here. OpenAI() reads OPENAI_API_KEY from the environment.
client = OpenAI()
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index(os.environ["PINECONE_INDEX_NAME"])
docs = [
    {
        "id": "policy_001",
        "text": "Customers must verify identity before account changes are processed.",
        "metadata": {"source": "kyc_policy", "doc_type": "policy", "region": "us"}
    },
    {
        "id": "faq_014",
        "text": "Wire transfers above $10,000 require additional review.",
        "metadata": {"source": "ops_faq", "doc_type": "faq", "region": "us"}
    }
]

embeddings = client.embeddings.create(
    model="text-embedding-3-small",
    input=[d["text"] for d in docs]
)

vectors = []
for doc, emb in zip(docs, embeddings.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {**doc["metadata"], "text": doc["text"]}
    })

index.upsert(vectors=vectors)
A few production notes:

- Chunk before embedding. Don’t dump entire PDFs into one vector.
- Store the original text in metadata only if it fits your governance rules.
- Use stable IDs so re-indexing is idempotent.
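The chunking note above can be sketched as a small helper. A word window with overlap is one common baseline, not the only option; chunk_text, the window sizes, and the ID scheme are illustrative assumptions, so tune them for your documents and embedding model.

```python
def chunk_text(doc_id: str, text: str, chunk_size: int = 200, overlap: int = 40):
    """Split text into overlapping word windows with stable, derived IDs."""
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for i, start in enumerate(range(0, max(len(words), 1), step)):
        piece = " ".join(words[start:start + chunk_size])
        if piece:
            # Stable IDs (doc ID + position) keep re-indexing idempotent.
            chunks.append({"id": f"{doc_id}_chunk_{i}", "text": piece})
    return chunks
```

Each returned dict slots directly into the docs list from the upsert example above.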
3) Retrieve relevant banking context from Pinecone
At query time, embed the user question with the same OpenAI embedding model, then retrieve top matches from Pinecone.
query = "What checks are required before changing customer account details?"

query_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=query
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    filter={"region": {"$eq": "us"}}
)

for match in results.matches:
    print(match.id, match.score)
    print(match.metadata["text"])
This is the retrieval layer your agent will rely on. For banking workloads, filters matter because policy often varies by region, product line, or customer segment.
4) Send retrieved context into an OpenAI chat completion
Now pass the retrieved snippets into the model as grounded context. This is where OpenAI for banking becomes useful: it can answer questions using your internal policy rather than guessing.
context_blocks = []
for match in results.matches:
    context_blocks.append(f"- {match.metadata['text']}")

system_prompt = """
You are a banking operations assistant.
Answer only using the provided context.
If the context is insufficient, say you do not have enough information.
"""

user_prompt = f"""
Question: {query}

Context:
{chr(10).join(context_blocks)}
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    temperature=0.0
)

print(response.choices[0].message.content)
For production AI systems:
- Set temperature=0 for policy-driven answers.
- Force citation-style responses if your compliance team needs traceability.
- Reject answers when retrieval confidence is low.
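The last point can be enforced with a simple gate before calling the model. This is a minimal sketch: confident_matches and the 0.75 cutoff are assumptions, and score scales vary by embedding model and similarity metric, so calibrate the threshold against your own index.

```python
from types import SimpleNamespace

def confident_matches(matches, min_score: float = 0.75, min_hits: int = 1):
    """Keep matches above the score threshold; return [] if too few qualify."""
    kept = [m for m in matches if m.score >= min_score]
    return kept if len(kept) >= min_hits else []

# Illustrative objects standing in for results.matches:
demo = [SimpleNamespace(score=0.91), SimpleNamespace(score=0.42)]
```

If confident_matches(results.matches) comes back empty, return a fixed "not enough information" response instead of generating an answer.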
5) Wrap retrieval + generation into one reusable function
This is the pattern you actually ship inside an agent service.
def answer_banking_question(question: str) -> str:
    q_emb = client.embeddings.create(
        model="text-embedding-3-small",
        input=question
    ).data[0].embedding

    matches = index.query(
        vector=q_emb,
        top_k=5,
        include_metadata=True,
        filter={"region": {"$eq": "us"}}
    ).matches

    context = "\n".join(
        f"[{m.id}] {m.metadata['text']}" for m in matches
    )

    messages = [
        {
            "role": "system",
            "content": (
                "You are a banking assistant. "
                "Use only the provided context. "
                "If unsure, say you need more information."
            )
        },
        {
            "role": "user",
            "content": f"Question: {question}\n\nContext:\n{context}"
        }
    ]

    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.0
    )
    return resp.choices[0].message.content.strip()
Testing the Integration
Run a simple end-to-end test with a banking-style question:
if __name__ == "__main__":
    answer = answer_banking_question(
        "Can we change account details without identity verification?"
    )
    print(answer)
Expected output should look like this:
Customers must verify identity before account changes are processed.
If identity verification has not been completed, account details should not be changed.
If you get an empty or irrelevant answer:
- Check that embeddings were created with the same model used at query time.
- Confirm your Pinecone index dimension matches the embedding dimension.
- Inspect metadata filters; they can silently exclude valid matches.
- Verify that retrieved text actually contains policy language worth answering from.
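For the dimension check, the index's configured dimension is available from pc.describe_index(index_name).dimension in the Pinecone client. Below is a small comparison helper; the model-to-dimension mapping reflects OpenAI's published defaults for the text-embedding-3 models, but verify it for the models you actually use.

```python
# Default output dimensions for OpenAI's text-embedding-3 models.
EXPECTED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def dimensions_match(index_dimension: int, model: str) -> bool:
    """True if the index dimension matches the embedding model's output size."""
    return EXPECTED_DIMS.get(model) == index_dimension
```

Run this once at startup so a mismatched index fails loudly instead of returning silently bad similarity scores.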
Real-World Use Cases
- KYC and onboarding assistant: answer analyst questions using internal onboarding policies, document requirements, and regional compliance rules.
- Customer support copilot: ground responses in product FAQs and service policies so support agents get consistent answers on fees, limits, and transfer rules.
- Fraud operations knowledge search: let investigators query playbooks, escalation paths, and case handling procedures without digging through shared drives.
Keep learning
- The complete AI Agents Roadmap (my full 8-step breakdown)
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me (I build AI for banks and insurance companies)
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit