How to Integrate OpenAI with Pinecone for Fintech Startups

By Cyprian Aarons · Updated 2026-04-21

Combining OpenAI with Pinecone gives you the basic stack for a useful fintech AI agent: a model that can reason over financial queries, and a vector store that can retrieve the right internal context fast. For startups, this is how you move from generic chat to an assistant that can answer policy questions, surface transaction notes, and pull relevant customer or product knowledge on demand.

Prerequisites

  • Python 3.10+
  • An OpenAI API key
  • A Pinecone API key and active index
  • pip installed
  • A dataset to embed, such as:
    • product FAQs
    • compliance docs
    • support tickets
    • transaction metadata summaries
  • Installed packages:
    • openai
    • pinecone
    • python-dotenv

Install dependencies:

pip install openai pinecone python-dotenv

Set your environment variables:

export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="fintech-agent-index"

Integration Steps

1) Initialize both clients

Start by loading credentials and creating SDK clients. Keep this in a small config module so your agent code stays clean.

import os
from dotenv import load_dotenv
from openai import OpenAI
from pinecone import Pinecone

load_dotenv()

openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

index_name = os.getenv("PINECONE_INDEX_NAME")
index = pc.Index(index_name)

This gives you two primitives:

  • openai_client.embeddings.create(...) for turning text into vectors
  • index.upsert(...) and index.query(...) for storing and retrieving vectors

2) Embed your fintech documents with OpenAI

Use an embedding model to convert policy text, support answers, or product docs into vectors. For startup use cases, keep chunks small and semantically focused.

documents = [
    {
        "id": "doc-001",
        "text": "Refunds are allowed within 30 days if the transaction has not been settled.",
        "metadata": {"type": "policy", "topic": "refunds"}
    },
    {
        "id": "doc-002",
        "text": "KYC verification requires government ID and proof of address.",
        "metadata": {"type": "compliance", "topic": "kyc"}
    }
]

texts = [doc["text"] for doc in documents]

embedding_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

vectors = embedding_response.data

Each item in vectors contains the embedding you’ll store in Pinecone. In production, add chunking before embedding long documents.
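Chunking strategies vary by document type; as a minimal sketch, a character-window splitter with overlap might look like the following (the window and overlap sizes are illustrative assumptions, not tuned values):

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows.

    A naive sketch; production systems often split on sentence or
    token boundaries instead of raw character counts.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # overlap preserves context across chunk edges
    return chunks
```

Each chunk then becomes its own record (for example, `doc-001-chunk-0`) so retrieval can return the specific passage rather than a whole document.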

3) Upsert embeddings into Pinecone

Now write those vectors into your index. Store metadata alongside each vector so retrieval can filter by document type or topic.

upserts = []
for doc, vec in zip(documents, vectors):
    upserts.append((
        doc["id"],
        vec.embedding,
        doc["metadata"] | {"text": doc["text"]}
    ))

index.upsert(vectors=upserts)
print(f"Upserted {len(upserts)} records into {index_name}")

A practical pattern here is to keep the raw text in metadata for quick debugging. If your payloads get large, store only references like document IDs and fetch full text from your database.
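As your corpus grows, a single upsert call can also run into Pinecone's request payload limits. A small batching helper, sketched here with an assumed batch size of 100 records per request, keeps writes manageable:

```python
def upsert_in_batches(index, records, batch_size: int = 100) -> int:
    """Write records to a Pinecone index in fixed-size batches.

    Pinecone caps request payload size, so large datasets should be
    split into modest batches. Returns the number of records written.
    """
    written = 0
    for i in range(0, len(records), batch_size):
        batch = records[i:i + batch_size]
        index.upsert(vectors=batch)
        written += len(batch)
    return written
```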

4) Query Pinecone with a user question

When a user asks something, embed the query using the same model, then search Pinecone for the closest matches.

query = "Can I request a refund after settlement?"
query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[query]
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in results.matches:
    print(match.id, match.score, match.metadata)

This is the retrieval step that makes your agent grounded. You are not asking the LLM to guess; you are giving it relevant context first.
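Because we stored type and topic in metadata during the upsert step, queries can also be scoped with Pinecone's metadata filter operators such as $eq and $in. A small helper for building such filters (topic_filter is a name of my own choosing, not part of any SDK) might look like:

```python
def topic_filter(doc_type: str, topics: list[str]) -> dict:
    """Build a Pinecone metadata filter: match the doc type AND any listed topic."""
    return {
        "type": {"$eq": doc_type},
        "topic": {"$in": topics},
    }

# Usage sketch:
# index.query(vector=query_embedding, top_k=3, include_metadata=True,
#             filter=topic_filter("policy", ["refunds"]))
```

Filtering like this matters in fintech contexts, where a refund question should not be answered from, say, an unrelated compliance memo.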

5) Generate an answer with OpenAI using retrieved context

Take the top matches from Pinecone and feed them into a chat completion call. This is where OpenAI turns retrieved snippets into a useful response.

context_blocks = []
for match in results.matches:
    context_blocks.append(match.metadata["text"])

context = "\n\n".join(context_blocks)

messages = [
    {
        "role": "system",
        "content": (
            "You are a fintech support assistant. "
            "Answer using only the provided context. "
            "If the context does not contain the answer, say you don't know."
        )
    },
    {
        "role": "user",
        "content": f"Context:\n{context}\n\nQuestion: {query}"
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2
)

print(response.choices[0].message.content)

That pattern is stable enough for startup production systems:

  • embed once
  • retrieve on every question
  • generate from retrieved context only
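Those three bullets can be folded into one function. This sketch passes the clients in explicitly, rather than relying on the module globals from step 1, which keeps it easy to test; the model names and prompt mirror the earlier steps:

```python
def answer(question: str, openai_client, index, top_k: int = 3) -> str:
    """Embed the question, retrieve context from Pinecone, and generate a reply."""
    # Embed the question with the same model used at indexing time.
    q_vec = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=[question],
    ).data[0].embedding

    # Retrieve the closest stored chunks.
    matches = index.query(
        vector=q_vec, top_k=top_k, include_metadata=True
    ).matches
    context = "\n\n".join(m.metadata["text"] for m in matches)

    # Generate strictly from the retrieved context.
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.2,
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a fintech support assistant. "
                    "Answer using only the provided context. "
                    "If the context does not contain the answer, say you don't know."
                ),
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            },
        ],
    )
    return response.choices[0].message.content
```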

Testing the Integration

Use one script to verify both storage and retrieval work end to end.

test_query = "What documents do I need for KYC?"

q_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[test_query]
).data[0].embedding

search_results = index.query(
    vector=q_embedding,
    top_k=1,
    include_metadata=True
)

assert len(search_results.matches) > 0

top_match = search_results.matches[0]
print("Top match:", top_match.id)
print("Score:", top_match.score)
print("Text:", top_match.metadata["text"])

Expected output (the exact score will vary):

Top match: doc-002
Score: 0.82
Text: KYC verification requires government ID and proof of address.

If you get no matches or low scores across everything, check these first:

  • same embedding model used for indexing and querying
  • correct Pinecone index dimension for the embedding model output
  • namespace mismatch if you use namespaces
  • empty or malformed metadata
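The dimension mismatch in particular is easy to check programmatically. This sketch (check_dimension is a hypothetical helper) compares the index's configured dimension, via the Pinecone client's describe_index call, against the known output sizes of OpenAI's embedding models:

```python
# Output dimensions of the OpenAI embedding models referenced in this guide.
EXPECTED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def check_dimension(pc, index_name: str, model: str) -> bool:
    """Return True if the index dimension matches the embedding model's output size."""
    info = pc.describe_index(index_name)
    return info.dimension == EXPECTED_DIMS[model]
```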

Real-World Use Cases

  • Customer support agent for fintech apps
    • Answer questions about refunds, chargebacks, limits, onboarding, and KYC using internal policy docs.
  • Compliance assistant
    • Retrieve regulatory guidance, internal controls, and audit notes before generating responses for ops teams.
  • Sales or account management copilot
    • Pull customer history, product usage notes, and contract details to help teams respond faster with accurate context.

The main pattern is simple: OpenAI handles language generation and embeddings; Pinecone handles retrieval at scale. For startups building AI agents in regulated environments, that combination gives you traceability, relevance, and a cleaner path to production than stuffing everything into prompts.


By Cyprian Aarons, AI Consultant at Topiax.