How to Integrate OpenAI with Pinecone for Payments AI Agents

By Cyprian Aarons · Updated 2026-04-21

Combining OpenAI with Pinecone gives you a practical payments agent stack: one side handles payment-related reasoning and workflow decisions, the other stores and retrieves merchant, invoice, policy, or transaction context at scale. That means your AI agent can answer payment questions, route disputes, surface relevant customer history, and keep state across conversations without stuffing everything into the prompt.

Prerequisites

  • Python 3.10+
  • An OpenAI account with API access enabled
  • A Pinecone account and an active index
  • API keys set as environment variables:
    • OPENAI_API_KEY
    • PINECONE_API_KEY
  • pip installed
  • A vector embedding model for indexing payment documents (this guide uses text-embedding-3-small)
  • A dataset of payment-related records:
    • invoices
    • chargeback notes
    • refund policies
    • transaction metadata
    • merchant FAQs

Install the SDKs:

pip install openai pinecone python-dotenv
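With the SDKs installed, the two keys listed in the prerequisites can live in a local .env file that python-dotenv loads at startup. A minimal sketch with placeholder values:

OPENAI_API_KEY=sk-your-openai-key
PINECONE_API_KEY=your-pinecone-key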

Integration Steps

1) Initialize OpenAI and Pinecone clients

Use OpenAI for generating embeddings or agent responses, and Pinecone for vector storage and retrieval.

import os
from dotenv import load_dotenv
from openai import OpenAI
from pinecone import Pinecone

load_dotenv()

openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
pinecone_client = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

index_name = "payments-agent-index"
index = pinecone_client.Index(index_name)

If you’re building a production agent, keep the OpenAI client focused on language tasks and Pinecone focused on retrieval. Don’t mix those responsibilities.
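The Index handle above assumes the index already exists. If it doesn't, create it first. A minimal sketch, assuming a serverless index on AWS us-east-1 and the 1536-dimension text-embedding-3-small model used in the next step; adjust cloud, region, and dimension to your setup:

from pinecone import ServerlessSpec

# Create the index on first run. The dimension must match your embedding
# model: text-embedding-3-small produces 1536-dimension vectors.
if index_name not in pinecone_client.list_indexes().names():
    pinecone_client.create_index(
        name=index_name,
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )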

2) Convert payment knowledge into embeddings

For an AI agent system, you usually want to embed support docs, payment policies, or ledger notes before storing them in Pinecone.

docs = [
    {
        "id": "refund-policy-001",
        "text": "Refunds are allowed within 30 days for duplicate charges with valid proof.",
        "metadata": {"type": "policy", "source": "billing_docs"}
    },
    {
        "id": "chargeback-guide-001",
        "text": "Chargebacks must be escalated within 5 business days with transaction ID and reason code.",
        "metadata": {"type": "procedure", "source": "ops_docs"}
    }
]

embeddings = []
for doc in docs:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=doc["text"]
    )
    vector = response.data[0].embedding
    embeddings.append((doc["id"], vector, doc["metadata"], doc["text"]))

This is the first place OpenAI and Pinecone work together: OpenAI turns text into vectors, Pinecone stores them for fast similarity search.
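The loop above makes one embeddings call per document. The OpenAI embeddings endpoint also accepts a list of inputs, so for larger corpora you can embed a batch in a single request; a sketch of the same step done that way:

# Embed all documents in one request; results come back in input order.
batch_response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=[doc["text"] for doc in docs]
)
embeddings = [
    (doc["id"], item.embedding, doc["metadata"], doc["text"])
    for doc, item in zip(docs, batch_response.data)
]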

3) Upsert vectors into Pinecone

Now write those embeddings into your index.

vectors = []
for doc_id, vector, metadata, text in embeddings:
    vectors.append({
        "id": doc_id,
        "values": vector,
        "metadata": {
            **metadata,
            "text": text
        }
    })

upsert_result = index.upsert(vectors=vectors)
print(upsert_result)

At this point your payment knowledge base is queryable by semantic meaning instead of exact keywords. That matters when users ask things like “Can I reverse this duplicate card charge?” instead of “refund policy.”
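A single upsert call is fine for a handful of records. For larger corpora, split the upsert into batches to stay under Pinecone's per-request limits; batches of around 100 vectors are a common starting point:

# Upsert in fixed-size batches to stay under request size limits.
BATCH_SIZE = 100
for start in range(0, len(vectors), BATCH_SIZE):
    index.upsert(vectors=vectors[start:start + BATCH_SIZE])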

4) Query Pinecone from the agent using an OpenAI embedding

When a user asks a question, embed the query with OpenAI and retrieve the closest payment context from Pinecone.

user_query = "How do I handle a duplicate card charge refund?"

query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=user_query
).data[0].embedding

matches = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in matches["matches"]:
    print(match["id"], match["score"], match["metadata"]["text"])

This gives your agent grounded context before it calls the chat model. For regulated workflows like payments, that grounding is what keeps responses consistent with policy.
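Because each vector carries metadata, you can also scope retrieval by record type. A sketch that restricts the search to policy documents using Pinecone's metadata filter syntax:

# Only retrieve vectors tagged as policies, ignoring procedures and notes.
policy_matches = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    filter={"type": {"$eq": "policy"}}
)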

5) Generate the final agent response with retrieved context

Take the retrieved snippets and feed them into an OpenAI chat completion call.

context_blocks = [
    m["metadata"]["text"]
    for m in matches["matches"]
]

system_prompt = (
    "You are a payments operations assistant. "
    "Use only the provided context to answer."
)

user_prompt = f"""
Question: {user_query}

Context:
{chr(10).join(context_blocks)}
"""

chat_response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]
)

print(chat_response.choices[0].message.content)

That’s the basic retrieval-augmented pattern: OpenAI interprets and responds, while Pinecone supplies relevant memory from your payments corpus.
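In practice you'll want this pattern behind a single function the rest of your agent can call. A minimal sketch that strings the three calls together; the function name and defaults are illustrative, not part of either SDK:

def answer_payment_question(query: str, top_k: int = 3) -> str:
    """Embed the query, retrieve payment context, and generate a grounded answer."""
    query_vector = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    ).data[0].embedding

    results = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    context = "\n".join(m["metadata"]["text"] for m in results["matches"])

    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a payments operations assistant. Use only the provided context to answer."},
            {"role": "user", "content": f"Question: {query}\n\nContext:\n{context}"}
        ]
    )
    return response.choices[0].message.content

print(answer_payment_question("How do I handle a duplicate card charge refund?"))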

Testing the Integration

Run a simple end-to-end test: embed a query, retrieve from Pinecone, then generate an answer.

test_query = "What should I do if a customer reports a duplicate charge?"

query_embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=test_query
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=1,
    include_metadata=True
)

context = results["matches"][0]["metadata"]["text"]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only retrieved payment policy context."},
        {"role": "user", "content": f"Question: {test_query}\n\nContext: {context}"}
    ]
)

print("Retrieved context:", context)
print("Agent answer:", response.choices[0].message.content)

Expected output (the retrieved context should match exactly; the generated answer will vary in wording):

Retrieved context: Refunds are allowed within 30 days for duplicate charges with valid proof.
Agent answer: The customer should submit proof of the duplicate charge. If it falls within 30 days, process the refund according to policy.

If you get a relevant match and a policy-aligned answer, the integration is working.
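One caveat: Pinecone always returns the nearest vectors, even when nothing in the index is genuinely relevant. A simple guard is to check the top match's similarity score before answering; the 0.5 threshold below is an illustrative starting point, not a fixed rule, so tune it against your own corpus:

# Refuse to answer from context when the best match is too weak.
SCORE_THRESHOLD = 0.5  # illustrative; tune per embedding model and corpus

best = results["matches"][0] if results["matches"] else None
if best is None or best["score"] < SCORE_THRESHOLD:
    print("No sufficiently relevant payment context found; escalate to a human.")
else:
    print("Using context:", best["metadata"]["text"])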

Real-World Use Cases

  • Payment support agents: answer refund, dispute, chargeback, and reconciliation questions using indexed policy docs and transaction notes.
  • Merchant operations copilots: let internal teams ask natural-language questions about failed payments, settlement delays, or payout exceptions.
  • Fraud triage assistants: retrieve similar historical cases from Pinecone and use OpenAI to summarize likely next actions for analysts.

The pattern here is simple: use OpenAI to understand language and generate responses, and use Pinecone to hold long-term payment memory. In production AI agents for banking- or insurance-adjacent workflows, that split keeps your system maintainable and easier to audit.

