How to Integrate OpenAI for payments with Pinecone for multi-agent systems

By Cyprian Aarons · Updated 2026-04-21
Tags: openai-for-payments, pinecone, multi-agent-systems

OpenAI and Pinecone solve two different parts of the same problem in a payments system.

OpenAI handles the agent’s reasoning and payment-related workflow decisions, while Pinecone gives you persistent vector memory across multiple agents. Combined, you can build systems that route payment intents, recover customer context, and keep every agent working from the same long-term state.

Prerequisites

  • Python 3.10+
  • An OpenAI API key
  • A Pinecone API key
  • A Pinecone index created with the right dimension for your embedding model (1536 for text-embedding-3-small)
  • pip-installed packages:
    • openai
    • pinecone
    • python-dotenv
  • A clear multi-agent design, for example (sketched after this list):
    • one agent for payment intent classification
    • one agent for fraud/risk checks
    • one agent for customer support or escalation
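
One lightweight way to make that design explicit is a registry that names each agent and its job. The structure below is an illustration of this guide's three roles, not an OpenAI or Pinecone construct:

# Illustrative agent registry for the three roles above
AGENTS = {
    "intent_classifier": "Classifies incoming payment intents",
    "risk_agent": "Runs fraud and chargeback risk checks",
    "support_agent": "Handles customer support and escalation",
}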

Install dependencies:

pip install openai pinecone python-dotenv

Set environment variables:

export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="payments-agents"
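
If you would rather keep keys in a local .env file, python-dotenv (already in the dependency list above) can load the same three variables before any client is created:

# Contents of a local .env file (same three variables):
# OPENAI_API_KEY=your-openai-key
# PINECONE_API_KEY=your-pinecone-key
# PINECONE_INDEX_NAME=payments-agents

from dotenv import load_dotenv

load_dotenv()  # copies values from .env into os.environ at startup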

Integration Steps

1) Initialize both clients

Start by wiring the OpenAI client and Pinecone client in the same service. Keep this in a shared module so every agent uses the same connection layer.

import os
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)

If you are running multiple agents, do not let each agent create its own isolated client config. Share this setup through your agent runtime or dependency injection container.
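
A minimal way to share the setup, assuming a module of your own named clients.py (not part of either SDK), is to expose both connections from one place and import them everywhere:

# clients.py -- hypothetical shared module; every agent imports from here
import os
from functools import lru_cache

from openai import OpenAI
from pinecone import Pinecone


@lru_cache(maxsize=1)
def get_openai() -> OpenAI:
    # One client instance per process, created lazily on first use
    return OpenAI(api_key=os.environ["OPENAI_API_KEY"])


@lru_cache(maxsize=1)
def get_index():
    # One Pinecone index handle shared by all agents
    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    return pc.Index(os.environ["PINECONE_INDEX_NAME"])

The lru_cache decorator makes each function a cheap singleton, so every agent that calls get_openai() or get_index() gets the same underlying connection.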

2) Create embeddings for payment-related context

Use OpenAI embeddings to turn transaction notes, customer messages, and policy text into vectors. Store those vectors in Pinecone so any agent can retrieve them later.

from openai import OpenAI

# OpenAI() with no arguments reads OPENAI_API_KEY from the environment,
# so this is the same connection as openai_client in step 1
client = OpenAI()

text = "Customer requested refund on card payment after duplicate charge."
embedding_response = client.embeddings.create(
    model="text-embedding-3-small",
    input=text
)

vector = embedding_response.data[0].embedding
print(len(vector))  # 1536 for text-embedding-3-small

Now upsert that vector into Pinecone with metadata that helps downstream agents filter results, including the raw text so retrieval can hand it straight to an LLM.

import uuid

record_id = str(uuid.uuid4())

index.upsert(
    vectors=[
        {
            "id": record_id,
            "values": vector,
            "metadata": {
                "type": "payment_case",
                "customer_id": "cust_123",
                "status": "pending_refund",
                "source": "support_chat"
            }
        }
    ],
    namespace="payments"
)

This is the core pattern: OpenAI generates the representation, Pinecone stores it for retrieval by all agents.
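
You can fold the pattern into one helper so every agent writes memory the same way. remember() is a name chosen here for illustration, not a library function:

def remember(text: str, metadata: dict) -> str:
    """Embed text with OpenAI and upsert it into the shared Pinecone namespace."""
    emb = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    ).data[0].embedding
    record_id = str(uuid.uuid4())
    index.upsert(
        vectors=[{"id": record_id, "values": emb, "metadata": {**metadata, "text": text}}],
        namespace="payments",
    )
    return record_id

# Usage: any agent can persist a case in one call
remember(
    "Customer requested refund on card payment after duplicate charge.",
    {"type": "payment_case", "customer_id": "cust_123"},
)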

3) Retrieve memory for a specific agent task

When a new payment request arrives, query Pinecone for similar historical cases before calling an LLM. This gives your orchestrator context that is grounded in prior outcomes.

query_text = "Duplicate card charge refund request"
query_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=query_text
).data[0].embedding

matches = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
    namespace="payments"
)

for match in matches["matches"]:
    print(match["id"], match["score"], match["metadata"])

Use this result to decide which agent should handle the case, for example with the keyword fallback sketched after this list:

  • support agent if it is a standard refund request
  • risk agent if it mentions chargeback or fraud
  • payments ops agent if settlement or reconciliation is involved
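
A deterministic version of those rules makes a useful fallback when you want routing without an LLM call. This is a sketch; the keyword lists are assumptions about your domain, and the agent names mirror the bullets above:

# Deterministic fallback router -- not a replacement for the LLM router
# in the next step, just a cheap first pass
def route_by_keywords(text: str) -> str:
    lowered = text.lower()
    if any(w in lowered for w in ("chargeback", "fraud", "dispute")):
        return "risk_agent"
    if any(w in lowered for w in ("settlement", "reconciliation", "ledger")):
        return "payments_ops_agent"
    return "support_agent"  # standard refunds and everything else

print(route_by_keywords(query_text))  # support_agent for the refund example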

4) Use OpenAI to route the task across agents

Now combine retrieved memory with an LLM call. The model decides which specialist agent should act next based on current input plus Pinecone context.

context_snippets = [
    m["metadata"] for m in matches["matches"]
]

prompt = f"""
You are a payment orchestration router.
Given the user request and historical cases, choose one action:
- support_agent
- risk_agent
- billing_agent

User request: {query_text}
Historical context: {context_snippets}

Return only the action name.
"""

response = openai_client.responses.create(
    model="gpt-4.1-mini",
    input=prompt
)

print(response.output_text)

In production, do not let this step directly execute money movement. Use it to produce a route decision only. The actual payment action should be handled by a controlled service with audit logs and approval checks.
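
A cheap guard, assuming the three route names from the prompt above, is to validate the model output against an allowlist before dispatching anything:

ALLOWED_ROUTES = {"support_agent", "risk_agent", "billing_agent"}

route = response.output_text.strip()
if route not in ALLOWED_ROUTES:
    # Never act on free-form model output; fall back to a safe default queue
    route = "support_agent"

print(f"Dispatching to: {route}")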

5) Persist multi-agent outcomes back into Pinecone

After each specialist finishes its work, store the outcome so future agents can reuse it. This is how you build durable shared memory across a multi-agent system.

outcome_text = """
Support confirmed duplicate charge.
Refund approved by billing.
Customer notified via email.
"""

outcome_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=outcome_text
).data[0].embedding

index.upsert(
    vectors=[
        {
            "id": str(uuid.uuid4()),
            "values": outcome_embedding,
            "metadata": {
                "type": "resolution",
                "customer_id": "cust_123",
                "result": "refund_approved",
                "agent": "billing_agent"
            }
        }
    ],
    namespace="payments"
)

That closes the loop (an end-to-end sketch follows the list):

  • OpenAI reasons over the case
  • Pinecone stores memory
  • every subsequent agent gets better context
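
Put together, the whole loop fits in one function. This is a minimal sketch: handle_case is a name chosen here, the actual dispatch to specialist agents is left out, and the prompt mirrors the router from step 4:

def handle_case(request_text: str) -> str:
    # 1. Recall similar cases from shared memory
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=request_text
    ).data[0].embedding
    recalled = index.query(
        vector=emb, top_k=3, include_metadata=True, namespace="payments"
    )
    context = [m["metadata"] for m in recalled["matches"]]

    # 2. Ask the router which specialist should act
    decision = openai_client.responses.create(
        model="gpt-4.1-mini",
        input=f"Route this payment request to support_agent, risk_agent, "
              f"or billing_agent. Request: {request_text}\nContext: {context}\n"
              f"Return only the action name.",
    ).output_text.strip()

    # 3. Persist the routing outcome back into shared memory
    index.upsert(
        vectors=[{
            "id": str(uuid.uuid4()),
            "values": emb,
            "metadata": {"type": "routing_decision", "agent": decision},
        }],
        namespace="payments",
    )
    return decision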

Testing the Integration

Run a simple end-to-end check: embed a sample issue, store it, query it back, then ask OpenAI to classify it.

test_text = "Card charged twice during checkout"

test_vec = client.embeddings.create(
    model="text-embedding-3-small",
    input=test_text
).data[0].embedding

index.upsert(
    vectors=[{
        "id": "test-payment-case",
        "values": test_vec,
        "metadata": {"type": "test_case", "status": "duplicate_charge"}
    }],
    namespace="payments"
)

result = index.query(
    vector=test_vec,
    top_k=1,
    include_metadata=True,
    namespace="payments"
)

classification = openai_client.responses.create(
    model="gpt-4.1-mini",
    input=f"Classify this payment issue with one short snake_case label: {test_text}"
)

print("Pinecone match:", result["matches"][0]["metadata"])
print("OpenAI output:", classification.output_text)

Expected output (the exact label will vary between runs):

Pinecone match: {'type': 'test_case', 'status': 'duplicate_charge'}
OpenAI output: duplicate_charge_or_refund_request

If your query returns no match, check the following (a diagnostic sketch follows this list):

  • index dimension matches your embedding model
  • namespace is consistent between upsert and query
  • API keys are loaded correctly
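
The first two checks can be printed directly; attribute names here follow the current Pinecone Python client:

# Quick diagnostics: the index dimension must match the embedding length
desc = pc.describe_index(index_name)
print("Index dimension:", desc.dimension)  # should be 1536 here
print("Vector length:", len(test_vec))     # must equal the index dimension
stats = index.describe_index_stats()
print("Namespaces:", stats.namespaces)     # "payments" should appear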

Real-World Use Cases

  • Payment support triage: route refund, chargeback, and failed-payment tickets to different agents based on semantic similarity and policy context.
  • Fraud review assistants: keep historical fraud patterns in Pinecone and use OpenAI to summarize suspicious behavior before escalating to analysts.
  • Multi-agent billing operations: one agent reconciles invoices, another validates ledger entries, and another drafts customer responses using shared memory from Pinecone.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

