LangChain Tutorial (Python): running agents in parallel for advanced developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to run multiple LangChain agents in parallel from Python, collect their outputs, and combine them into a single result. You’d use this when one agent is not enough: for example, when you want separate research, compliance, and summarization agents working at the same time instead of waiting on each other.

What You'll Need

  • Python 3.10+
  • langchain
  • langchain-openai
  • An OpenAI API key
  • An OpenAI-compatible model with tool-calling support
  • Basic familiarity with LangChain tools, prompts, and chat models
  • A shell environment where you can set environment variables

Install the packages:

pip install langchain langchain-openai openai

Set your API key:

export OPENAI_API_KEY="your-key-here"

Step-by-Step

  1. Start by defining two independent agents that can work on different parts of the same task. The key idea is that each agent gets its own prompt, its own tools if needed, and its own execution path. We also define one trivial placeholder tool, since the OpenAI API rejects an empty tools array, and each prompt ends with the agent_scratchpad placeholder that create_tool_calling_agent expects.
import asyncio

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# The OpenAI API rejects an empty tools array, so define one trivial
# placeholder tool that every agent can share.
@tool
def echo(text: str) -> str:
    """Return the input text unchanged."""
    return text

tools = [echo]

# create_tool_calling_agent requires an agent_scratchpad placeholder in the prompt.
research_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research agent. Return concise findings."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

analysis_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an analysis agent. Turn findings into actionable recommendations."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
  2. Wrap each prompt in an agent executor. In production, this is where you attach tools like web search, SQL access, or internal APIs; here each agent just shares the placeholder tool defined above, so the parallel execution pattern stays clear and executable.
research_agent = create_tool_calling_agent(llm, tools=tools, prompt=research_prompt)
analysis_agent = create_tool_calling_agent(llm, tools=tools, prompt=analysis_prompt)

research_executor = AgentExecutor(agent=research_agent, tools=tools, verbose=False)
analysis_executor = AgentExecutor(agent=analysis_agent, tools=tools, verbose=False)
  3. Create async worker functions for each agent. Running them as coroutines lets Python schedule both calls at once instead of blocking on one before starting the other.
async def run_research(topic: str):
    result = await research_executor.ainvoke({"input": f"Research this topic: {topic}"})
    return result["output"]

async def run_analysis(findings: str):
    result = await analysis_executor.ainvoke({"input": f"Analyze these findings:\n{findings}"})
    return result["output"]
  4. Use asyncio.gather() to run both coroutines concurrently. This pattern works well when the tasks are independent, or when one task can start from a shared input while another does a different job on the same input.
async def run_parallel_agents(topic: str):
    research_task = run_research(topic)
    analysis_task = run_analysis(
        f"Topic: {topic}\n"
        f"Assume the research agent will provide findings separately."
    )

    research_result, analysis_result = await asyncio.gather(
        research_task,
        analysis_task,
    )

    return {
        "research": research_result,
        "analysis": analysis_result,
    }
  5. If you want true dependency chaining, do it in two phases: first gather the parallel results, then feed them into a final synthesis step. This is the pattern you want for multi-agent systems in banking or insurance workflows, where parallel collection happens first and controlled consolidation happens second.
synthesis_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a synthesis agent. Combine inputs into one practical answer."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

synthesis_agent = create_tool_calling_agent(llm, tools=tools, prompt=synthesis_prompt)
synthesis_executor = AgentExecutor(agent=synthesis_agent, tools=tools, verbose=False)

async def synthesize(topic: str):
    results = await run_parallel_agents(topic)
    combined_input = (
        f"Topic: {topic}\n\n"
        f"Research:\n{results['research']}\n\n"
        f"Analysis:\n{results['analysis']}"
    )
    final = await synthesis_executor.ainvoke({"input": combined_input})
    return final["output"]
  6. Run the pipeline from an async entry point. Keep the top-level call small and explicit so it’s easy to test and easy to move into an API endpoint or background worker later.
if __name__ == "__main__":
    topic = "How banks should monitor AI-generated customer communications"

    async def main():
        output = await synthesize(topic)
        print(output)

    asyncio.run(main())

Testing It

Run the script and confirm that the two agents produce output concurrently rather than one after the other. If you add timestamps around each coroutine call, total runtime should sit closer to the slower single request than to the sum of both.
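
To see the overlap directly, wrap each coroutine in a small timer. A minimal sketch; the timed helper, labels, and timed_run are ours, not part of LangChain:

import time

async def timed(label: str, coro):
    start = time.perf_counter()
    result = await coro
    print(f"{label}: {time.perf_counter() - start:.1f}s")
    return result

async def timed_run(topic: str):
    start = time.perf_counter()
    research_result, analysis_result = await asyncio.gather(
        timed("research", run_research(topic)),
        timed("analysis", run_analysis(f"Topic: {topic}")),
    )
    # Total should sit near the slower call, not the sum of both.
    print(f"total: {time.perf_counter() - start:.1f}s")
    return research_result, analysis_result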

Test with a topic that produces non-trivial output so you can verify the synthesis step is actually combining two distinct responses. If one agent fails, asyncio.gather() will surface the exception immediately unless you explicitly change error handling.
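
If you would rather keep partial results than fail the whole batch, pass return_exceptions=True and inspect each item yourself. A sketch of that variant:

async def run_parallel_agents_safe(topic: str):
    results = await asyncio.gather(
        run_research(topic),
        run_analysis(f"Topic: {topic}"),
        return_exceptions=True,  # failed coroutines return their exception
    )
    research_result, analysis_result = results
    if isinstance(research_result, Exception):
        research_result = f"research failed: {research_result}"
    if isinstance(analysis_result, Exception):
        analysis_result = f"analysis failed: {analysis_result}"
    return {"research": research_result, "analysis": analysis_result}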

For production-style validation, log each agent’s raw output separately before synthesis. That makes it obvious whether a bad final answer came from weak individual outputs or from poor combination logic.
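
A minimal sketch with the standard logging module; synthesize_with_logging is our variant of the tutorial’s synthesize function:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("parallel_agents")

async def synthesize_with_logging(topic: str):
    results = await run_parallel_agents(topic)
    # Log each raw output before synthesis blends them together.
    logger.info("research output:\n%s", results["research"])
    logger.info("analysis output:\n%s", results["analysis"])
    combined_input = (
        f"Topic: {topic}\n\n"
        f"Research:\n{results['research']}\n\n"
        f"Analysis:\n{results['analysis']}"
    )
    final = await synthesis_executor.ainvoke({"input": combined_input})
    return final["output"]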

Next Steps

  • Add real tools to each agent using @tool functions for search, database lookup, or policy retrieval (first sketch below).
  • Replace AgentExecutor with LangGraph when you need durable state, retries, branching, and checkpoints (second sketch below).
  • Add structured outputs with Pydantic models so your parallel agents return typed JSON instead of free text (third sketch below).
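
For the first bullet, here is a hedged sketch of a real tool wired into the research agent; lookup_policy, its policies dict, and the return strings are placeholders for your own retrieval logic:

@tool
def lookup_policy(policy_id: str) -> str:
    """Look up an internal policy document by its ID."""
    # Placeholder body: swap in a real database or API call.
    policies = {"COMMS-7": "AI-generated customer messages require human review."}
    return policies.get(policy_id, "No policy found.")

research_tools = [lookup_policy]
research_agent = create_tool_calling_agent(llm, tools=research_tools, prompt=research_prompt)
research_executor = AgentExecutor(agent=research_agent, tools=research_tools, verbose=False)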
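
For the second bullet, a minimal LangGraph fan-out/fan-in sketch, assuming langgraph is installed and reusing the executors from this tutorial; the state keys and node names are ours. Both branch nodes run in the same step, and the synthesis node waits for both:

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class PipelineState(TypedDict):
    topic: str
    research: str
    analysis: str
    final: str

async def research_node(state: PipelineState):
    out = await research_executor.ainvoke({"input": f"Research this topic: {state['topic']}"})
    return {"research": out["output"]}

async def analysis_node(state: PipelineState):
    out = await analysis_executor.ainvoke({"input": f"Analyze this topic: {state['topic']}"})
    return {"analysis": out["output"]}

async def synthesis_node(state: PipelineState):
    out = await synthesis_executor.ainvoke(
        {"input": f"Research:\n{state['research']}\n\nAnalysis:\n{state['analysis']}"}
    )
    return {"final": out["output"]}

graph = StateGraph(PipelineState)
graph.add_node("research", research_node)
graph.add_node("analysis", analysis_node)
graph.add_node("synthesis", synthesis_node)
graph.add_edge(START, "research")        # fan out: both branches start together
graph.add_edge(START, "analysis")
graph.add_edge("research", "synthesis")  # fan in: synthesis runs after both branches
graph.add_edge("analysis", "synthesis")
graph.add_edge("synthesis", END)
app = graph.compile()
# Usage: result = await app.ainvoke({"topic": "..."}); result["final"] holds the answer.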
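
And for the third bullet, one option is to skip the executor for agents that do not need tools and call with_structured_output directly; ResearchFindings and its fields are hypothetical:

from pydantic import BaseModel, Field

class ResearchFindings(BaseModel):
    summary: str = Field(description="One-paragraph summary of the findings")
    key_points: list[str] = Field(description="Individual findings as bullet points")

structured_llm = llm.with_structured_output(ResearchFindings)

async def run_research_typed(topic: str) -> ResearchFindings:
    # Returns a validated ResearchFindings instance instead of free text.
    return await structured_llm.ainvoke(f"Research this topic: {topic}")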

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

