How to Integrate Next.js with the Vercel AI SDK for RAG in Investment Banking

By Cyprian Aarons · Updated 2026-04-21

Combining a Next.js frontend with the Vercel AI SDK gives you a clean path to ship RAG-powered internal tools for investment banking: deal memo search, compliance Q&A, pitchbook assistants, and analyst copilots. The pattern is simple: Next.js handles the UI and server routes, the Vercel AI SDK handles streaming and tool orchestration, and a Python service handles retrieval against banking documents and systems.

Prerequisites

  • Node.js 18+ and npm installed
  • Python 3.10+ installed
  • A Next.js app already created
  • The ai package (Vercel AI SDK) and a provider package such as @ai-sdk/openai installed in your Next.js project
  • A Python backend for retrieval, embeddings, or document search
  • Access to your investment banking document store:
    • PDFs
    • deal notes
    • CIMs
    • compliance policies
  • Environment variables configured:
    • OPENAI_API_KEY
    • VECTOR_DB_URL
    • VECTOR_DB_API_KEY
  • Familiarity with:
    • Next.js Route Handlers
    • Vercel AI SDK streamText
    • Retrieval-Augmented Generation (RAG)

Integration Steps

1) Expose a Python retrieval API for banking documents

Your Python service should own retrieval. Keep the banking corpus, chunking logic, and vector lookup out of the frontend.

from fastapi import FastAPI
from pydantic import BaseModel
from typing import List

app = FastAPI()

class QueryRequest(BaseModel):
    query: str
    top_k: int = 5

class Chunk(BaseModel):
    source: str
    text: str
    score: float

@app.post("/rag/search", response_model=List[Chunk])
def rag_search(req: QueryRequest):
    # Replace this with your vector DB query:
    # results = pinecone_index.query(vector=embed(req.query), top_k=req.top_k)
    results = [
        {"source": "Q4_Deal_Memo.pdf", "text": "EBITDA adjusted for one-time items...", "score": 0.94},
        {"source": "Compliance_Policy.md", "text": "No client-specific MNPI may be exposed...", "score": 0.88},
    ]
    return results[: req.top_k]

This gives you a stable retrieval endpoint that your Next.js app can call before generating an answer.

2) Create a Next.js API route that calls the Python retriever

Use a Route Handler in Next.js to fetch relevant chunks from your Python service. In banking workflows, this keeps the UI thin and makes retrieval auditable. The Python helper below shows the shape of that call; a TypeScript sketch of the handler itself follows.

import os
import requests

PYTHON_RAG_URL = os.getenv("PYTHON_RAG_URL", "http://localhost:8000")

def retrieve_context(query: str):
    resp = requests.post(
        f"{PYTHON_RAG_URL}/rag/search",
        json={"query": query, "top_k": 5},
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()

In production, this function would sit behind a small internal service or be called from a Next.js server action. The important part is that retrieval happens before generation.
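
For the Next.js side itself, a minimal Route Handler sketch might look like the following (App Router, TypeScript). The route path app/api/retrieve/route.ts is illustrative, and it assumes the FastAPI service above is reachable via a PYTHON_RAG_URL environment variable.

// app/api/retrieve/route.ts: illustrative sketch, not a drop-in implementation.
const PYTHON_RAG_URL = process.env.PYTHON_RAG_URL ?? "http://localhost:8000";

export async function POST(req: Request) {
  const { query } = await req.json();

  // Retrieval stays server-side so the banking corpus never reaches the client.
  const resp = await fetch(`${PYTHON_RAG_URL}/rag/search`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, top_k: 5 }),
  });

  if (!resp.ok) {
    return new Response("Retrieval failed", { status: 502 });
  }

  return Response.json(await resp.json());
}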

3) Wire Vercel AI SDK generation to the retrieved context

Now pass the retrieved chunks into the prompt you send through the Vercel AI SDK. The SDK's streamText function is the core piece on the Next.js side; the Python below mirrors the same grounded-prompt pattern against the OpenAI API.

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def build_prompt(user_question: str, chunks):
    context_block = "\n\n".join(
        [f"[Source: {c['source']} | Score: {c['score']}]\n{c['text']}" for c in chunks]
    )

    return f"""
You are an investment banking assistant.
Use only the provided context when answering.
If the answer is not in context, say you don't have enough information.

Context:
{context_block}

Question:
{user_question}
""".strip()

def generate_answer(user_question: str, chunks=None):
    # Reuse pre-routed chunks when provided; otherwise retrieve fresh context.
    if chunks is None:
        chunks = retrieve_context(user_question)
    prompt = build_prompt(user_question, chunks)

    response = client.responses.create(
        model="gpt-4.1-mini",
        input=prompt,
    )
    return response.output_text

That mirrors how you’d use the Vercel AI SDK in a Next.js route handler: retrieve first, then stream or generate from grounded context.
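
As a sketch of that handler with the SDK itself: the TypeScript below retrieves first, then streams with streamText. It assumes the ai and @ai-sdk/openai packages on a recent SDK version; the model name and route path are illustrative.

// app/api/answer/route.ts: retrieve-then-generate sketch with the Vercel AI SDK.
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const PYTHON_RAG_URL = process.env.PYTHON_RAG_URL ?? "http://localhost:8000";

export async function POST(req: Request) {
  const { question } = await req.json();

  // 1) Retrieve grounded context from the Python service.
  const resp = await fetch(`${PYTHON_RAG_URL}/rag/search`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: question, top_k: 5 }),
  });
  const chunks: { source: string; text: string; score: number }[] = await resp.json();

  const contextBlock = chunks
    .map((c) => `[Source: ${c.source} | Score: ${c.score}]\n${c.text}`)
    .join("\n\n");

  // 2) Generate only from the retrieved context, streaming tokens to the client.
  const result = await streamText({
    model: openai("gpt-4.1-mini"),
    system:
      "You are an investment banking assistant. Use only the provided context. " +
      "If the answer is not in context, say you don't have enough information.",
    prompt: `Context:\n${contextBlock}\n\nQuestion:\n${question}`,
  });

  return result.toTextStreamResponse();
}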

4) Add structured tool-style routing for banking tasks

Investment banking assistants usually need more than free-form chat. Add routing so questions about policy, deal data, or valuation go through different retrieval paths.

from enum import Enum

class QueryType(str, Enum):
    policy = "policy"
    deal = "deal"
    valuation = "valuation"

def classify_query(query: str) -> QueryType:
    q = query.lower()
    if any(x in q for x in ["mnpi", "compliance", "policy"]):
        return QueryType.policy
    if any(x in q for x in ["ebitda", "revenue", "deal", "cim"]):
        return QueryType.deal
    return QueryType.valuation

def answer_banking_query(query: str):
    query_type = classify_query(query)

    if query_type == QueryType.policy:
        chunks = retrieve_context(f"compliance policy {query}")
    elif query_type == QueryType.deal:
        chunks = retrieve_context(f"deal memo {query}")
    else:
        chunks = retrieve_context(f"valuation model {query}")

    return generate_answer(query, chunks)

This is where RAG becomes useful in real banking workflows. Different questions hit different corpora without forcing analysts to manually choose sources.
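
If you would rather let the model route than maintain a keyword classifier, the Vercel AI SDK can express the same idea with tools. The sketch below assumes AI SDK 4-style tool definitions (zod parameters plus maxSteps) and a hypothetical retrieve() helper wrapping the Python /rag/search endpoint.

// Sketch: model-driven routing with AI SDK tools (AI SDK 4-style API assumed).
import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Hypothetical helper wrapping the Python retriever from step 1.
async function retrieve(query: string) {
  const resp = await fetch(`${process.env.PYTHON_RAG_URL}/rag/search`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, top_k: 5 }),
  });
  return resp.json();
}

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai("gpt-4.1-mini"),
    messages,
    maxSteps: 3, // allow a tool call plus a grounded final answer
    tools: {
      searchPolicies: tool({
        description: "Search compliance policies: MNPI, disclosures, restricted lists.",
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => retrieve(`compliance policy ${query}`),
      }),
      searchDeals: tool({
        description: "Search deal memos, CIMs, and valuation material.",
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => retrieve(`deal memo ${query}`),
      }),
    },
  });

  return result.toTextStreamResponse();
}

The keyword classifier stays useful as a cheap, auditable fallback; tool routing handles the ambiguous questions it would misclassify.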

5) Return streaming output to the Next.js client

On the frontend, the Vercel AI SDK's useChat hook (or the streamText pattern) streams tokens back to the user. Your backend just needs to emit a streaming response after retrieval.

from fastapi.responses import StreamingResponse

@app.post("/chat")
def chat(req: QueryRequest):
    chunks = rag_search(req)
    prompt = build_prompt(req.query, chunks)

    def token_stream():
        # Replace with actual streaming from your model provider,
        # passing `prompt` to its streaming API.
        yield "Based on the deal memo, EBITDA was adjusted for one-time items."

    return StreamingResponse(token_stream(), media_type="text/plain")

For production, connect this to actual model streaming so analysts get incremental answers instead of waiting on full completion.
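
On the client, a minimal useChat component might look like the sketch below. It assumes the AI SDK 4-style hook (exported from @ai-sdk/react; older releases use ai/react) pointed at your streaming route, with streamProtocol set to "text" to match the plain-text stream above.

"use client";

// Illustrative client component; hook API assumes AI SDK 4.
import { useChat } from "@ai-sdk/react";

export default function DealRoomChat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/chat",        // your streaming route
    streamProtocol: "text",  // plain-text stream, as in the FastAPI sketch above
  });

  return (
    <div>
      {messages.map((m) => (
        <p key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </p>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask about a deal, policy, or valuation..."
        />
      </form>
    </div>
  );
}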

Testing the Integration

Run a simple end-to-end check from Python against your retriever and generator path.

if __name__ == "__main__":
    question = "What compliance restriction applies to MNPI?"
    answer = answer_banking_query(question)
    print(answer)

Expected output:

Based on the provided compliance policy, no client-specific MNPI may be exposed outside approved channels.

If you see that kind of grounded answer instead of hallucinated finance jargon, the integration is working.

Real-World Use Cases

  • Analyst copilot for pitchbook prep:
    • Ask questions like “What was adjusted EBITDA in the latest CIM?”
    • Pull answers from deal docs and auto-draft slide notes
  • Compliance assistant:
    • Search internal policies for MNPI handling, disclosures, and restricted list rules
    • Give bankers policy-grounded answers before they send emails or drafts
  • Deal room Q&A bot:
    • Let users ask questions across CIMs, diligence reports, and management presentations
    • Stream answers directly inside a Next.js dashboard with citations

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
