# How to Integrate OpenAI with Pinecone for Lending AI Agents
Combining OpenAI with Pinecone gives you a practical pattern for building lending AI agents that can answer borrower questions, retrieve policy context, and reason over loan documents without stuffing everything into the prompt. OpenAI handles the language and decisioning layer, while Pinecone gives the agent fast semantic retrieval over underwriting guides, product terms, KYC notes, and historical case files.
## Prerequisites

- Python 3.10+
- An OpenAI API key
- A Pinecone API key and an existing index
- Access to your lending content:
  - loan product docs
  - underwriting policies
  - FAQ knowledge base
  - customer support transcripts or case notes
- Installed packages: `openai`, `pinecone`, `python-dotenv`

```bash
pip install openai pinecone python-dotenv
```
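The API keys can live in a local `.env` file that `python-dotenv` loads at startup. A minimal sketch, with placeholder values; the variable names match the `os.getenv` calls used below, and `PINECONE_INDEX_NAME` is optional since the code falls back to a default:

```bash
# .env — keep this file out of version control
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=...
PINECONE_INDEX_NAME=lending-agent-index
```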
## Integration Steps
**Step 1: Set your environment variables and initialize both clients.**
```python
import os

from dotenv import load_dotenv
from openai import OpenAI
from pinecone import Pinecone

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_INDEX_NAME = os.getenv("PINECONE_INDEX_NAME", "lending-agent-index")

client = OpenAI(api_key=OPENAI_API_KEY)
pc = Pinecone(api_key=PINECONE_API_KEY)
index = pc.Index(PINECONE_INDEX_NAME)
```
**Step 2: Create embeddings for lending documents with OpenAI.**
Use the embedding model to convert policy text into vectors before storing them in Pinecone.
```python
documents = [
    {
        "id": "loan_policy_001",
        "text": "Personal loans above $25,000 require proof of income and two recent bank statements.",
        "metadata": {"type": "policy", "product": "personal_loan"},
    },
    {
        "id": "loan_policy_002",
        "text": "Applicants with a credit score below 620 require manual review by an underwriter.",
        "metadata": {"type": "policy", "product": "underwriting"},
    },
]

texts = [doc["text"] for doc in documents]

embedding_response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts,
)

vectors = []
for doc, emb in zip(documents, embedding_response.data):
    vectors.append({
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {**doc["metadata"], "text": doc["text"]},
    })
```
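The snippets above are already bite-sized, but real policy documents usually are not. Before embedding, you would typically split long documents into overlapping chunks so that each vector covers one retrievable idea. A minimal character-window sketch; the chunk size and overlap values are illustrative, not recommendations:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk then becomes its own document entry,
# e.g. with an id like "loan_policy_003-chunk-0".
```

In practice you might chunk on paragraph or section boundaries instead of raw characters, so a retrieved chunk never cuts a policy rule in half.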
**Step 3: Upsert the vectors into Pinecone.**
This is where your agent’s long-term memory starts. Store both the vector and enough metadata to reconstruct the answer later.
```python
index.upsert(vectors=vectors)
print(f"Upserted {len(vectors)} lending documents into Pinecone.")
```
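With more than a handful of documents, upserting one giant list can run into request-size limits, so upserts are usually batched. A sketch that assumes the `vectors` list from the previous step; the batch size of 100 is a common choice, not a hard limit:

```python
def batched(items: list, size: int = 100):
    """Yield successive slices of `items` with at most `size` elements each."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Usage with the Pinecone index from earlier:
# for batch in batched(vectors):
#     index.upsert(vectors=batch)
```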
**Step 4: Retrieve relevant context at query time and feed it into OpenAI.**
This is the core agent loop: embed the user question, query Pinecone, then pass retrieved context to the model.
```python
user_question = "What documents are needed for a personal loan over $25,000?"

query_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=user_question,
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True,
)

context = "\n".join(match.metadata["text"] for match in results.matches)

response = client.responses.create(
    model="gpt-4.1-mini",
    input=f"""
You are a lending assistant.
Answer only using the provided context.

Context:
{context}

Question:
{user_question}
""",
)

print(response.output_text)
```
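Under the hood, Pinecone ranks matches by vector similarity; with a cosine-metric index, the question embedding is compared against every stored document embedding. A toy illustration of that scoring, using tiny made-up two-dimensional vectors rather than real 1536-dimensional embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

question_vec = [1.0, 0.0]
doc_vecs = {
    "income_policy": [0.9, 0.1],   # points roughly the same way as the question
    "unrelated_doc": [0.0, 1.0],   # orthogonal, i.e. semantically unrelated
}

ranked = sorted(
    doc_vecs,
    key=lambda d: cosine_similarity(question_vec, doc_vecs[d]),
    reverse=True,
)
print(ranked[0])  # the income policy ranks first
```

This is exactly the ordering `top_k` gives you back, which is why indexing and querying must use the same embedding model: vectors from different models do not live in a comparable space.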
**Step 5: Wrap retrieval and generation into a reusable agent function.**
In production, you do not want retrieval logic scattered across handlers. Keep it in one function so your agent can be called from an API route, queue worker, or orchestration layer.
```python
def answer_lending_question(question: str) -> str:
    q_emb = client.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding

    matches = index.query(
        vector=q_emb,
        top_k=3,
        include_metadata=True,
    ).matches

    context = "\n".join(m.metadata["text"] for m in matches)

    result = client.responses.create(
        model="gpt-4.1-mini",
        input=f"""
You are a lending operations assistant.
Use only this context to answer.

Context:
{context}

Question:
{question}
""",
    )
    return result.output_text


print(answer_lending_question("When does a borrower need manual underwriting review?"))
```
## Testing the Integration
Run a simple end-to-end check: insert one policy snippet, query it, then confirm the response cites the right rule.
```python
test_question = "What happens if the applicant's credit score is below 620?"
answer = answer_lending_question(test_question)

print("QUESTION:", test_question)
print("ANSWER:", answer)
```
Expected output:
```text
QUESTION: What happens if the applicant's credit score is below 620?
ANSWER: Applicants with a credit score below 620 require manual review by an underwriter.
```
If you get an unrelated answer, check these first:
- The same embedding model is used for indexing and querying (`text-embedding-3-small` in both places).
- Your Pinecone index dimension matches `text-embedding-3-small` (1536).
- You passed `include_metadata=True` in the query.
- Your prompt restricts the model to the retrieved context only.
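The dimension mismatch in particular is easy to catch in code: each OpenAI embedding model has a fixed output size, so you can compare it against your index configuration at startup instead of debugging bad search results later. A small sketch; the `check_index_dimension` helper is hypothetical, but the dimensions listed are the models' actual defaults:

```python
# Default output dimensions for OpenAI's text-embedding-3 models.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def check_index_dimension(model: str, index_dimension: int) -> None:
    """Raise early if the Pinecone index was created with the wrong size."""
    expected = EMBEDDING_DIMS[model]
    if index_dimension != expected:
        raise ValueError(
            f"Index dimension {index_dimension} does not match "
            f"{model} (expected {expected})."
        )

check_index_dimension("text-embedding-3-small", 1536)  # passes silently
```

In a real deployment you would feed in the live value, e.g. from `index.describe_index_stats()`, rather than a hard-coded number.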
## Real-World Use Cases

- **Loan policy assistant.** Let internal ops teams ask questions like "What docs are needed for SME loans above $100k?" and get answers grounded in current policy docs.
- **Borrower support agent.** Build a chat agent that explains eligibility rules, required paperwork, repayment terms, and escalation paths without hand-coded decision trees.
- **Underwriting copilot.** Retrieve similar historical cases, compare them against current application data, and help underwriters triage applications faster.
## Keep learning

- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit, a PDF checklist plus starter code
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit