5 Multi-Agent Orchestration Patterns with Azure AI Foundry
A single AI agent answering questions is impressive. A single AI agent completing a complex, multi-step project — autonomously routing tasks, delegating to specialists, tracking progress, and knowing when it is done — is an entirely different category of system.
That second category is what multi-agent orchestration makes possible.
This article walks through five production-ready orchestration patterns, all built on the Azure AI Foundry SDK (@azure/ai-projects) in Node.js. Each pattern solves a different coordination problem, and knowing which one to reach for is the difference between building an AI feature and building an AI system.
The full working source for all five patterns is on GitHub: azizjarrar/azure-multi-agent-orchestration-patterns
Contents
- Why Single-Agent Systems Break Down
- Foundation: Azure AI Foundry Setup
- Authentication: Keyless with DefaultAzureCredential
- Pattern 1: Sequential Orchestration
- Pattern 2: Concurrent Orchestration
- Pattern 3: Handoff Orchestration
- Pattern 4: Group Chat Orchestration
- Pattern 5: Magentic Orchestration
- Pattern Comparison
- What to Watch Out For
Why Single-Agent Systems Break Down
A single AI agent is a brilliant generalist. Give it a clear, bounded task — "summarize this document", "write a unit test for this function" — and it performs well.
But real-world tasks are rarely clear and bounded. Consider a content marketing pipeline: you need a researcher to gather facts, a writer to draft an article, and an editor to refine it. These are three distinct roles with different expertise, different instructions, and a specific execution order. Stuffing all three into one system prompt creates a messy compromise — the agent context bloats, the roles blur, and output quality drops.
Or consider a customer support system that handles billing, technical, and account questions. You could build one mega-agent with instructions for everything. Or you could build three specialists and a routing layer that sends each request to the right one.
Multi-agent systems solve this by decomposing work into focused units. Each agent has a single responsibility, clear instructions, and operates at the right point in the workflow. The orchestrator — whether a simple loop or a sophisticated planning engine — is what ties them together.
Azure AI Foundry's SDK gives you the building blocks to construct these orchestrators yourself, with full control over how agents communicate, what state they share, and how decisions are made.
Foundation: Azure AI Foundry Setup
Before writing orchestration code, you need an Azure AI Foundry project with a deployed model. This takes about five minutes.
1. Create a Hub and Project
Go to ai.azure.com and sign in.
- Click New project
- Select or create a Hub — the top-level resource that holds billing, networking, and access control
- Name your project and click Create
2. Deploy GPT-4.1
Inside your project, go to My assets → Models + endpoints → Deploy model.
- Select gpt-4.1
- Set the deployment name — this is what goes in your code's model field, so keep it clean (e.g. gpt-4.1)
- Click Deploy
3. Get Your Connection String
In your project's Overview tab, find the Project connection string. It looks like:
eastus.api.azureml.ms;00000000-0000-0000-0000-000000000000;my-resource-group;my-project
Add it to your .env:
AZURE_AI_PROJECT_CONNECTION_STRING="eastus.api.azureml.ms;..."
AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4.1"
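Since the connection string has a fixed four-part shape, a small startup check catches copy-paste mistakes before the first API call. A minimal sketch — the parseConnectionString helper is ours for illustration, not part of the SDK:

```javascript
// Expected shape: "<host>;<subscription-id>;<resource-group>;<project-name>"
function parseConnectionString(raw) {
  const parts = (raw ?? "").split(";");
  if (parts.length !== 4 || parts.some((p) => p.length === 0)) {
    throw new Error(
      "AZURE_AI_PROJECT_CONNECTION_STRING must have 4 ';'-separated parts"
    );
  }
  const [host, subscriptionId, resourceGroup, projectName] = parts;
  return { host, subscriptionId, resourceGroup, projectName };
}

const parsed = parseConnectionString(
  "eastus.api.azureml.ms;00000000-0000-0000-0000-000000000000;my-resource-group;my-project"
);
console.log(parsed.projectName); // "my-project"
```

Failing fast here turns a cryptic mid-request authentication error into an obvious configuration error at boot.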
4. Grant Yourself Access for Keyless Auth
All five examples use DefaultAzureCredential — no hardcoded API keys. For this to work locally, your Azure account needs the Azure AI Developer role on your AI Foundry project.
In the Azure Portal, go to your AI Foundry resource → Access control (IAM) → Add role assignment → assign Azure AI Developer to your account.
Then authenticate locally:
az login
Authentication: Keyless with DefaultAzureCredential
Every example in this repository uses the same authentication setup. It is the correct pattern for any serious Azure AI application.
import { AIProjectClient } from "@azure/ai-projects";
import { DefaultAzureCredential } from "@azure/identity";
import "dotenv/config";
const client = new AIProjectClient(
process.env.AZURE_AI_PROJECT_CONNECTION_STRING,
new DefaultAzureCredential()
);
const chatClient = client.inference.azureOpenAI();
const MODEL = process.env.AZURE_OPENAI_DEPLOYMENT_NAME;
DefaultAzureCredential walks through a chain of authentication methods automatically. Locally it picks up your az login session. In production on Azure Container Apps, App Service, or a VM, it uses the resource's Managed Identity — no secrets to rotate, no API keys to leak, no accidental exposure in environment variable dumps.
client.inference.azureOpenAI() returns an OpenAI-compatible client scoped to your AI Foundry project. All five orchestration examples use this same chatClient and model. The only thing that varies across patterns is how agents are composed and coordinated around that client.
Pattern 1: Sequential Orchestration
File: sequential_orchestration/sequential_example.js
What It Does
Sequential orchestration runs agents in a fixed order, where each agent builds directly on the previous agent's output. It is the simplest and most predictable pattern — a linear pipeline with no branching, no dynamic selection.
The classic use case is a content creation pipeline: Researcher → Writer → Editor. The Researcher provides facts, the Writer drafts from those facts, and the Editor refines the draft. No agent needs to know about the others — they each respond to the growing conversation history that the orchestrator maintains.
The Orchestrator
class SequentialBuilder {
constructor(agents, chatClient, model) {
this.agents = agents;
this.chatClient = chatClient;
this.model = model;
}
async *run_stream(task) {
const conversationHistory = [{ role: "user", content: task }];
for (const agent of this.agents) {
const response = await this.chatClient.chat.completions.create({
model: this.model,
messages: [
{ role: "system", content: agent.instructions },
...conversationHistory,
],
});
const output = response.choices[0].message.content;
conversationHistory.push({ role: "assistant", content: output });
yield { agent: agent.name, output };
}
}
}
Key detail — conversation history as the handoff mechanism: No custom data transfer logic is needed. Each agent receives the full conversationHistory, which already contains every prior agent's response. The Writer sees the Researcher's findings because they live in the message array. The Editor sees both. The growing context is the handoff.
Key detail — async generator (async *): The orchestrator yields events as each agent completes rather than waiting for the full pipeline to finish. This lets the calling code display incremental output and handle results one step at a time:
for await (const event of builder.run_stream(task)) {
console.log(`\n[${event.agent}]`);
console.log(event.output);
}
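The history-as-handoff mechanism can be verified without calling Azure at all. The sketch below swaps in a hand-rolled stub for chatClient (a test double, not an SDK class; the real create call is awaited, but a plain return is enough to show the mechanics) and confirms that each successive agent sees one more message than the last:

```javascript
// Synchronous test double with the chat.completions.create shape the builder calls.
const calls = [];
const stubChatClient = {
  chat: {
    completions: {
      create: ({ messages }) => {
        calls.push(messages.length);
        return { choices: [{ message: { content: `reply ${calls.length}` } }] };
      },
    },
  },
};

// The same loop body as SequentialBuilder.run_stream, inlined for the sketch.
const agents = [
  { name: "Researcher", instructions: "research" },
  { name: "Writer", instructions: "write" },
  { name: "Editor", instructions: "edit" },
];
const history = [{ role: "user", content: "some task" }];
for (const agent of agents) {
  const res = stubChatClient.chat.completions.create({
    model: "stub",
    messages: [{ role: "system", content: agent.instructions }, ...history],
  });
  history.push({ role: "assistant", content: res.choices[0].message.content });
}

console.log(calls); // [2, 3, 4]: each agent sees one more message than the last
```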
Defining the Agents
const agents = [
{
name: "Researcher",
instructions: `You are a research specialist. Gather relevant facts, statistics,
and expert insights on the given topic. Be thorough — cite key points that a
writer would need to craft an informative article.`,
},
{
name: "Writer",
instructions: `You are a professional content writer. Using the research provided,
craft a well-structured, engaging article with clear headings, smooth transitions,
and an accessible tone suitable for a general audience.`,
},
{
name: "Editor",
instructions: `You are a senior editor. Review the article and improve clarity,
flow, grammar, and overall quality. Ensure it is polished and publication-ready.`,
},
];
When to Use It
- Content generation pipelines: research → write → edit → translate
- Data processing workflows where each step transforms the previous output
- Any multi-stage process with a fixed, known execution order
- When you need predictable, auditable execution — you always know exactly which step ran and what it produced
What the Output Looks Like
Task: Write an article on "The impact of artificial intelligence on modern healthcare"
[Researcher]
Key findings: FDA approved 521 AI-enabled medical devices in 2023 (up from 6 in 2015).
Diagnostic AI achieves 94.5% accuracy in detecting diabetic retinopathy vs 91.3% for
ophthalmologists. AI drug discovery cut average development timelines from 12 years to
under 4 in recent trials. Notable: NHS deployed AI triage tools that reduced A&E wait
times by 23% in pilot trusts...
[Writer]
## The Quiet Revolution in Your Doctor's Office
When Dr. Sarah Chen reviews an MRI scan today, she has an unusual colleague looking
over her shoulder — one that has studied more images than any radiologist alive.
Artificial intelligence is not replacing physicians. It is making them faster,
more accurate, and ultimately more effective...
[Editor]
## The Quiet Revolution in Your Doctor's Office
When Dr. Sarah Chen reviews an MRI scan today, an unusual colleague looks over her
shoulder — one trained on more images than any living radiologist. Artificial
intelligence isn't replacing physicians; it's sharpening them. Faster diagnoses.
Fewer missed findings. More time for the human work that machines cannot do...
Pattern 2: Concurrent Orchestration
File: concurrent_orchestration/concurrent_example.js
What It Does
Concurrent orchestration runs all agents in parallel on the same task and collects their independent results. Where Sequential is about building on previous work, Concurrent is about getting multiple expert perspectives simultaneously — with no agent influenced by what the others say.
The use case: you are evaluating a new product launch. You want a market researcher's analysis, a marketer's positioning strategy, and a legal team's risk assessment — all at once, in the time it takes to run one.
The Orchestrator
class ConcurrentBuilder {
constructor(agents, chatClient, model) {
this.agents = agents;
this.chatClient = chatClient;
this.model = model;
}
async *run(task) {
const agentPromises = this.agents.map((agent) =>
this.chatClient.chat.completions
.create({
model: this.model,
messages: [
{ role: "system", content: agent.instructions },
{ role: "user", content: task },
],
})
.then((response) => ({
agent: agent.name,
output: response.choices[0].message.content,
}))
);
const results = await Promise.all(agentPromises);
for (const result of results) {
yield result;
}
}
}
Key detail — Promise.all for true parallelism: All API calls are fired simultaneously. With three agents, total latency is roughly the slowest single response — not the sum of all three. For expensive, time-sensitive analysis tasks, this is a significant throughput advantage over running agents sequentially.
Key detail — no shared context: Each agent receives only the original task and its own system instructions. This is intentional. If the Legal agent sees the Marketer's enthusiasm, it may soften its risk assessment. Independence preserves the integrity of each perspective, which is the entire point of this pattern.
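The latency claim is easy to check with stubbed calls (run as an ES module, since the sketch uses top-level await). Three fake agents taking 300, 200, and 100 ms finish together in roughly the time of the slowest:

```javascript
// Stub agent call that resolves after a fixed delay; no Azure required.
const fakeAgentCall = (ms) =>
  new Promise((resolve) => setTimeout(() => resolve(ms), ms));

const start = Date.now();
// Fire all three "agents" at once, exactly like the agentPromises map above.
const results = await Promise.all([
  fakeAgentCall(300),
  fakeAgentCall(200),
  fakeAgentCall(100),
]);
const elapsed = Date.now() - start;

console.log(results); // [300, 200, 100] (Promise.all preserves input order)
console.log(`elapsed ~${elapsed}ms`); // roughly 300ms (the max), not 600ms (the sum)
```

Note that Promise.all also preserves input order in its result array, which is why the orchestrator can safely zip results back to agent names.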
Defining the Agents
const agents = [
{
name: "Researcher",
instructions: `You are a market research analyst. Analyze the market opportunity,
target demographics, competitive landscape, and growth potential.
Provide data-driven insights with specific numbers where possible.`,
},
{
name: "Marketer",
instructions: `You are a marketing strategist. Develop a positioning strategy,
key messaging, target audience profiles, and go-to-market recommendations.
Focus on differentiation and customer acquisition channels.`,
},
{
name: "Legal",
instructions: `You are a legal advisor. Identify potential regulatory hurdles,
liability concerns, intellectual property issues, and compliance requirements.
Flag specific risks the business must address before launch.`,
},
];
When to Use It
- •Multi-perspective analysis where agent independence matters (legal + business + technical reviews run in isolation)
- •Generating diverse content options in parallel — three ad copy variations, three landing page headlines
- •Any workflow where agents do not depend on each other and speed matters
- •A/B testing agent prompts — run the same task across differently-instructed agents and compare output quality side by side
What the Output Looks Like
Task: Analyze launching a budget-friendly electric bike for urban commuters.
[Researcher]
Urban e-bike market projected at $46.4B by 2028 (CAGR 12.6%). Primary segment:
commuters 25-45, households earning $50-100K. Top pain points: last-mile transit gaps,
parking costs, fitness motivation. Price analysis reveals an underserved gap between
$800-$1,500 — most DTC entrants cluster below $800 or above $2,000...
[Marketer]
Positioning: "Your daily commute, reclaimed." Target: environmentally-conscious urban
professionals who have considered e-bikes but found them cost-prohibitive. Primary
channel: LinkedIn + Reddit cycling communities for acquisition. Influencer partnerships
with urban lifestyle creators (50K-500K followers) for awareness. Key message hook:
TCO vs car ownership ($0.06/mile vs $0.67/mile)...
[Legal]
Regulatory exposure: CPSC e-bike classification (Class 1/2/3) dictates where the
product can legally be ridden and at what speed — misclassification creates federal
recall liability. Battery safety compliance (UL 2849 certification) is non-negotiable
for retail partnerships and product liability insurance. EU market requires CE marking
under Machinery Directive 2006/42/EC — budget for a 6-month certification cycle...
Pattern 3: Handoff Orchestration
File: handoff_orchestration/handoff_example.js
What It Does
Handoff orchestration gives agents the ability to transfer control to a specialist when a request is outside their expertise. The routing decision is made by the agent itself — using natural language — rather than by a hardcoded rule in the orchestrator.
The canonical use case: a customer support triage system. A Triage agent receives the initial request. If it determines the issue is technical, it transfers to TechExpert. If it is billing-related, it transfers to BillingExpert. Each specialist handles only what it was designed for, and the conversation history travels with the handoff.
The Routing Convention
Agents are instructed to signal a handoff with a specific phrase:
const agents = {
Triage: {
name: "Triage",
instructions: `You are a customer support triage agent. Your job is to understand
the customer's issue and route it to the correct specialist.
Available specialists:
- TechExpert: hardware issues, software bugs, network problems, device troubleshooting
- BillingExpert: charges, refunds, subscription changes, payment methods
If the issue requires a specialist, respond ONLY with: "TRANSFER TO [SpecialistName]"
If you can resolve it directly, respond with the solution.`,
},
TechExpert: {
name: "TechExpert",
instructions: `You are a senior technical support engineer. Diagnose and resolve
technical issues with detailed, step-by-step troubleshooting instructions.
Explain the likely root cause alongside the fix.`,
},
BillingExpert: {
name: "BillingExpert",
instructions: `You are a billing specialist. Handle all billing, subscription,
refund, and payment-related queries with accuracy and empathy. Always confirm
the resolution clearly before closing.`,
},
};
Key detail — prompt-driven routing: The handoff signal is a plain text convention, not a function call or SDK parameter. The agent decides when to transfer based on its own judgment of the request. This means routing is as sophisticated as the model — it handles edge cases and ambiguous requests that a rules-based router would miss.
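Because the regex is the entire routing layer, the convention is worth unit-testing against both compliant and paraphrased outputs. A quick sketch of the matcher the orchestrator relies on, pulled out as a standalone helper (the extractHandoff name is ours):

```javascript
const TRANSFER_RE = /TRANSFER TO (\w+)/;

// Returns the specialist name, or null if no handoff signal is present.
function extractHandoff(output) {
  const match = output.match(TRANSFER_RE);
  return match ? match[1] : null;
}

console.log(extractHandoff("TRANSFER TO TechExpert")); // "TechExpert"
console.log(extractHandoff("Here is how to fix it yourself: ...")); // null
// A paraphrase like "I'll transfer you to TechExpert" also returns null,
// because the case-sensitive all-caps signal is what the regex keys on.
console.log(extractHandoff("I'll transfer you to TechExpert")); // null
```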
The Orchestrator
class AutonomousHandoffWorkflow {
constructor(agents, chatClient, model) {
this.agents = agents;
this.chatClient = chatClient;
this.model = model;
}
async run(task, startingAgent) {
let currentAgent = this.agents[startingAgent];
const conversationHistory = [{ role: "user", content: task }];
while (currentAgent) {
const response = await this.chatClient.chat.completions.create({
model: this.model,
messages: [
{ role: "system", content: currentAgent.instructions },
...conversationHistory,
],
});
const output = response.choices[0].message.content;
console.log(`\n[${currentAgent.name}]: ${output}`);
const transferMatch = output.match(/TRANSFER TO (\w+)/);
if (transferMatch) {
const nextAgentName = transferMatch[1];
currentAgent = this.agents[nextAgentName];
conversationHistory.push({ role: "assistant", content: output });
} else {
break; // Agent resolved the request — no transfer needed
}
}
}
}
Key detail — conversation history follows the handoff: When TechExpert receives control, it sees the full conversation history including the Triage agent's assessment. It knows what was already determined — it does not start from scratch. The customer does not repeat themselves.
Key detail — the loop terminates naturally: The loop continues as long as agents issue transfer signals. The first agent that responds without TRANSFER TO has claimed ownership of the resolution. Multi-hop routing (Triage → TechExpert → SeniorTechExpert) works without any changes to the orchestrator.
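One production hardening worth considering: because routing is agent-driven, two miscalibrated agents could transfer to each other forever. A hedged sketch of a hop ceiling layered onto the same loop shape (the maxHops parameter and the synchronous respond stubs are our additions for illustration, not part of the original example):

```javascript
// The handoff loop with a hop ceiling; respond() stands in for the awaited chat call.
function runWithHopLimit(agents, startingAgent, maxHops = 5) {
  let current = agents[startingAgent];
  let hops = 0;
  const path = [];
  while (current) {
    path.push(current.name);
    const output = current.respond();
    const match = output.match(/TRANSFER TO (\w+)/);
    if (!match) break; // agent claimed ownership, same as the original orchestrator
    if (++hops >= maxHops) {
      path.push("(hop limit reached: escalate to a human)");
      break;
    }
    current = agents[match[1]];
  }
  return path;
}

// Two stub agents that bounce the request back and forth indefinitely.
const loopingAgents = {
  A: { name: "A", respond: () => "TRANSFER TO B" },
  B: { name: "B", respond: () => "TRANSFER TO A" },
};

console.log(runWithHopLimit(loopingAgents, "A", 3));
// ["A", "B", "A", "(hop limit reached: escalate to a human)"]
```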
When to Use It
- •Customer support routing with multiple specialist tiers
- •Escalation workflows: Tier 1 → Tier 2 → on-call engineer
- •Domain-gated information retrieval: legal questions to a legal agent, technical questions to a technical agent
- •Any workflow where the correct handler is not known until the content of the request is understood
What the Output Looks Like
Issue: My internet keeps disconnecting every 20 minutes like clockwork.
[Triage]: This is a technical connectivity issue — the 20-minute pattern suggests
a DHCP lease renewal failure or a power management override on the network adapter.
TRANSFER TO TechExpert
[TechExpert]: The 20-minute interval is a classic DHCP lease expiration signature.
Here is how to diagnose it:
1. Open Command Prompt as administrator and run: ipconfig /all
2. Note your "Lease Obtained" and "Lease Expires" timestamps
3. If the disconnection aligns with lease renewal, run: netsh int ip reset
If timestamps do not match the disconnection interval, check Device Manager →
Network Adapters → your Wi-Fi adapter → Power Management → uncheck "Allow the
computer to turn off this device to save power". This setting is the most common
cause of periodic disconnections on laptops...
Pattern 4: Group Chat Orchestration
File: group_chat_orchestration/group_chat_example.js
What It Does
Group Chat orchestration simulates a real collaborative discussion between multiple agents. A Manager agent moderates the conversation — it reads the current discussion state and decides which specialist should speak next. Agents build on, challenge, and refine each other's contributions across multiple rounds.
This pattern is designed for open-ended collaborative tasks: brainstorming sessions, design reviews, and strategic planning discussions where the best output emerges from structured discourse rather than a single agent's response.
The Architecture
Group Chat has two distinct layers: the Manager who controls the conversation, and the Participants who contribute content.
const manager = {
name: "Manager",
instructions: `You are a conversation facilitator. Your role is to select the most
valuable next contributor based on the discussion so far.
Review the conversation history and respond with ONLY the name of who should speak
next from the available participants. Do not add any other text.
When the discussion has reached a solid conclusion with actionable outcomes,
respond with exactly: FINISH`,
};
The Orchestrator
class DynamicGroupChatBuilder {
constructor(agents, manager, chatClient, model, maxRounds = 6) {
this.agents = agents;
this.manager = manager;
this.chatClient = chatClient;
this.model = model;
this.maxRounds = maxRounds;
}
async run(task) {
const conversationHistory = [{ role: "user", content: task }];
for (let round = 0; round < this.maxRounds; round++) {
// Step 1: Manager selects the next speaker
const selectionResponse = await this.chatClient.chat.completions.create({
model: this.model,
messages: [
{ role: "system", content: this.manager.instructions },
...conversationHistory,
{
role: "user",
content: `Available participants: ${Object.keys(this.agents).join(", ")}. Who speaks next?`,
},
],
});
const nextSpeaker = selectionResponse.choices[0].message.content.trim();
if (nextSpeaker === "FINISH" || !this.agents[nextSpeaker]) break;
// Step 2: Selected agent contributes to the conversation
const agentResponse = await this.chatClient.chat.completions.create({
model: this.model,
messages: [
{ role: "system", content: this.agents[nextSpeaker].instructions },
...conversationHistory,
],
});
const output = agentResponse.choices[0].message.content;
console.log(`\n[${nextSpeaker}]: ${output}`);
conversationHistory.push({
role: "assistant",
content: `${nextSpeaker}: ${output}`,
});
}
}
}
Key detail — the Manager sees the full conversation: The Manager's selection decision is based on the entire conversation history. After round one, it knows who has spoken and can decide who adds the most value next — selecting a participant who challenged a claim, or one whose perspective has not yet been heard.
Key detail — all participants see the full conversation: Every agent sees the full history when they contribute. This enables genuine discourse: the Brand Strategist can explicitly reference and build on what the Creative Director said two turns prior. It is not simulated discussion — it is agents responding to each other.
Key detail — the Manager terminates early when appropriate: maxRounds prevents infinite loops, but a well-prompted Manager will call FINISH when it judges that the discussion has produced actionable outcomes — before hitting the ceiling. Completion is semantic, not numerical.
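Because the Manager's reply is free text, the orchestrator above already guards with `!this.agents[nextSpeaker]` — any unrecognized reply ends the discussion. A slightly more forgiving normalization (our sketch, not part of the example) tolerates stray punctuation and casing before giving up:

```javascript
// Normalize the Manager's free-text selection against known participant names.
function resolveSpeaker(raw, agentNames) {
  const cleaned = raw.trim().replace(/[."'!]/g, "");
  if (cleaned.toUpperCase() === "FINISH") return "FINISH";
  return (
    agentNames.find((n) => n.toLowerCase() === cleaned.toLowerCase()) ?? null
  );
}

const names = ["CreativeDirector", "BrandStrategist", "PRSpecialist"];
console.log(resolveSpeaker("  brandstrategist. ", names)); // "BrandStrategist"
console.log(resolveSpeaker("finish", names)); // "FINISH"
console.log(resolveSpeaker("Someone else entirely", names)); // null
```

Whether null should end the chat (as the original does) or trigger a retry of the selection call is a design choice; retrying once before terminating is a reasonable middle ground.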
Defining the Participants
const agents = {
CreativeDirector: {
name: "CreativeDirector",
instructions: `You are a Creative Director known for bold, memorable ideas. Think
big, challenge conventions, and push for concepts that generate genuine buzz.
Reference real-world viral campaigns when they strengthen your point.`,
},
BrandStrategist: {
name: "BrandStrategist",
instructions: `You are a Brand Strategist who balances creativity with business
objectives. Evaluate ideas for brand alignment, target audience fit, and long-term
brand equity. Keep the conversation grounded in strategic reality.`,
},
PRSpecialist: {
name: "PRSpecialist",
instructions: `You are a PR Specialist focused on media coverage and public
perception. Assess how campaigns will play in press, what journalists will find
compelling, and how to maximize earned media value.`,
},
};
When to Use It
- •Brainstorming sessions where diverse and sometimes conflicting perspectives improve the outcome
- •Design or strategy reviews where agents are expected to challenge each other
- •Simulating expert panels, committees, or multidisciplinary teams
- •Any task where the best answer emerges from deliberation, not from a single agent's reasoning
What the Output Looks Like
Task: Plan a viral marketing stunt for a zero-gravity coffee cup.
[CreativeDirector]: Partner with a space agency for a live "First Coffee in Orbit"
stream from the ISS. The cup performs in zero-g, peaks at 2M concurrent viewers.
The hook: every buyer's name goes on a manifest of "Coffee Explorers." The moment
becomes a cultural artifact, not just a product demo...
[PRSpecialist]: The ISS angle is tier-1 press gold — NASA involvement gives instant
credibility with science, tech, and lifestyle journalists. I'd add a competitive layer:
challenge a rival coffee brand to a zero-g taste test. Brand rivalry drives tabloid
and business press that the ISS partnership alone won't reach...
[BrandStrategist]: The ISS hook earns attention. But the product story needs to
convert it. The message should be "it works in zero-g, it works in your commute."
The stunt is the awareness; the utility angle is what drives purchase intent.
Without that bridge, we get impressions without sales...
[Manager]: FINISH
Pattern 5: Magentic Orchestration
File: magentic_orchestration/magentic_example.js
What It Does
Magentic orchestration is the most sophisticated of the five patterns. It introduces a central Task Ledger — a structured state object that tracks the project's goal, gathered findings, current plan, completed subtasks, and whether the project is finished. A Manager agent reads the ledger each round, makes a structured decision about next steps, delegates to a specialist, and then updates the ledger with what was learned. This loop continues until the project is complete.
This pattern is designed for complex, multi-step projects where the path to completion is not known in advance and the Manager must reason about progress — adapting its plan based on what previous agents discovered.
The Task Ledger
The ledger is the architectural centerpiece. It is the single source of truth for project state across every round:
const taskLedger = {
mainGoal: task,
factsGathered: [],
currentPlan: "",
subtasksCompleted: [],
isProjectFinished: false,
};
Every loop iteration, the Manager receives the full ledger — including everything previously learned — and decides what to do next. This is genuine project memory. The Manager is not re-evaluating from scratch each round; it is reasoning about accumulated knowledge.
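The ledger mechanics are plain object updates, so they are easy to exercise without a model. A sketch of two rounds' worth of bookkeeping, using canned stand-ins for the Manager's decisions and the specialists' outputs:

```javascript
const taskLedger = {
  mainGoal: "Create a growth plan",
  factsGathered: [],
  currentPlan: "",
  subtasksCompleted: [],
  isProjectFinished: false,
};

// Canned stand-ins for two rounds of Manager decisions + specialist findings.
const rounds = [
  { plan: "Research first", subtask: "Research the market", finding: "Market is $2.3B" },
  { plan: "Now analyze", subtask: "Analyze subscriptions", finding: "Subscription LTV is ~3x" },
];

for (const r of rounds) {
  taskLedger.currentPlan = r.plan; // the plan is replaced each round
  taskLedger.subtasksCompleted.push(r.subtask); // what was delegated accumulates
  taskLedger.factsGathered.push(r.finding); // what came back accumulates
}

console.log(taskLedger.factsGathered.length); // 2
console.log(taskLedger.currentPlan); // "Now analyze"
```

Note the asymmetry: facts and subtasks accumulate, but the plan is overwritten. The Manager always works from its latest strategy while retaining every finding.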
The Manager's Structured Response
The Manager is instructed to respond in JSON, making its decisions machine-parseable and auditable:
const manager = {
name: "Manager",
instructions: `You are a project manager coordinating a team of specialists.
You will receive a task ledger tracking the current project state.
Analyze the ledger and decide the next step. Respond ONLY in this JSON format:
{
"updatedPlan": "Your revised strategy for completing the project",
"nextAgent": "Name of the specialist to delegate to next",
"subtaskInstructions": "Specific, actionable instructions for the selected agent",
"reasoning": "Why you selected this agent and this subtask right now"
}
If the project goal has been fully achieved, respond ONLY with:
{
"finalAnswer": "Complete synthesis of all work done and final deliverables"
}
Available specialists: Researcher, MarketAnalyst, StrategyConsultant`,
};
The Orchestrator
class MagenticWorkflow {
constructor(agents, manager, chatClient, model, maxRounds = 6) {
this.agents = agents;
this.manager = manager;
this.chatClient = chatClient;
this.model = model;
this.maxRounds = maxRounds;
}
async run(task) {
const taskLedger = {
mainGoal: task,
factsGathered: [],
currentPlan: "",
subtasksCompleted: [],
isProjectFinished: false,
};
for (let round = 0; round < this.maxRounds; round++) {
if (taskLedger.isProjectFinished) break;
// Step 1: Manager analyzes the ledger and decides the next move
const managerResponse = await this.chatClient.chat.completions.create({
model: this.model,
messages: [
{ role: "system", content: this.manager.instructions },
{
role: "user",
content: `Current Task Ledger:\n${JSON.stringify(taskLedger, null, 2)}\n\nWhat is the next step?`,
},
],
response_format: { type: "json_object" },
});
const decision = JSON.parse(managerResponse.choices[0].message.content);
// Step 2: Check for project completion
if (decision.finalAnswer) {
console.log("\n=== PROJECT COMPLETE ===");
console.log(decision.finalAnswer);
taskLedger.isProjectFinished = true;
break;
}
// Step 3: Update the ledger with the manager's revised plan
taskLedger.currentPlan = decision.updatedPlan;
console.log(`\n[Manager → ${decision.nextAgent}]: ${decision.subtaskInstructions}`);
// Step 4: Delegate to the selected specialist
const agent = this.agents[decision.nextAgent];
const agentResponse = await this.chatClient.chat.completions.create({
model: this.model,
messages: [
{ role: "system", content: agent.instructions },
{ role: "user", content: decision.subtaskInstructions },
],
});
const agentOutput = agentResponse.choices[0].message.content;
console.log(`\n[${decision.nextAgent}]: ${agentOutput}`);
// Step 5: Update the ledger with findings from this round
taskLedger.factsGathered.push(agentOutput);
taskLedger.subtasksCompleted.push(decision.subtaskInstructions);
}
}
}
Key detail — response_format: { type: "json_object" }: This forces GPT-4.1 to respond with valid JSON every time. Combined with a clear schema in the system prompt, it eliminates parsing failures entirely. The Manager's output is always machine-readable.
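One caveat: response_format guarantees syntactically valid JSON, not that the expected keys are present. A defensive wrapper (our sketch, not part of the example) distinguishes the two Manager response shapes and fails loudly on anything else:

```javascript
// Classify the Manager's JSON into one of its two documented shapes.
function parseManagerDecision(content) {
  const decision = JSON.parse(content); // json_object mode means this parses
  if (typeof decision.finalAnswer === "string") {
    return { done: true, finalAnswer: decision.finalAnswer };
  }
  if (decision.nextAgent && decision.subtaskInstructions) {
    return { done: false, ...decision };
  }
  throw new Error("Manager JSON matched neither expected shape: " + content);
}

console.log(parseManagerDecision('{"finalAnswer": "All done"}').done); // true
console.log(
  parseManagerDecision(
    '{"updatedPlan": "p", "nextAgent": "Researcher", "subtaskInstructions": "go", "reasoning": "r"}'
  ).nextAgent
); // "Researcher"
```

On a shape mismatch, re-prompting the Manager once with the error message is usually enough to recover.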
Key detail — the ledger compounds: Each round, factsGathered and subtasksCompleted grow. By round three, the Manager knows exactly what was researched, what was analyzed, and what gaps remain. It can recognize when the goal has been met rather than running to the max round limit.
Key detail — specialist agents receive only their subtask: Unlike Sequential (where agents see the full history) or Group Chat (where all participants see everything), Magentic delegates narrow, specific instructions to each specialist. The Manager synthesizes the big picture; the specialists execute focused subtasks. This is the separation of concerns that makes the pattern scale.
Defining the Specialists
const agents = {
Researcher: {
name: "Researcher",
instructions: `You are a business research specialist. Gather relevant market
data, industry trends, and factual information. Be specific with numbers
and cite sources when possible.`,
},
MarketAnalyst: {
name: "MarketAnalyst",
instructions: `You are a market analyst. Analyze target customer segments,
competitive positioning, and market entry barriers. Focus on quantitative
insights and identify specific, actionable opportunities.`,
},
StrategyConsultant: {
name: "StrategyConsultant",
instructions: `You are a strategy consultant. Develop clear, actionable strategic
recommendations based on research and analysis provided. Prioritize practical,
implementable steps with explicit success metrics.`,
},
};
When to Use It
- •Complex multi-step projects where the execution path is not fully known upfront
- •Research → analysis → recommendation workflows requiring adaptive planning
- •Building AI systems that reason about their own progress and adjust course mid-project
- •Any task where a transparent, round-by-round audit trail of decisions and findings is valuable
What the Output Looks Like
Task: Create a 3-step growth plan for an organic tea shop expanding to online sales.
[Manager → Researcher]: Research the online organic tea DTC market — size, growth
rate, top competitors, and the acquisition channels they rely on most.
[Researcher]: Online organic tea market: $2.3B in 2024, 8.9% CAGR through 2030.
Top DTC players: Vahdam (raised $23M), Art of Tea, Harney & Sons. Primary
acquisition channels: Instagram (avg 3.2% engagement vs 1.1% industry baseline),
subscription models drive 40% higher LTV vs one-time buyers...
[Manager → MarketAnalyst]: Analyze the subscription model opportunity specifically.
What does CAC look like for a small DTC tea brand, and what tier structure optimizes
for LTV without overwhelming a first-time online buyer?
[MarketAnalyst]: Subscription economics for tea DTC: CAC $28-45 via paid social,
$12-18 via organic/influencer. Subscription LTV: $280 vs $90 one-time purchase.
Recommended tier structure: Taster ($19/mo, 3 samples) → Explorer ($39/mo, full
sizes) → Connoisseur ($69/mo, exclusive blends + early access). Start with Taster
only — reduces friction and improves first-month retention by ~30%...
[Manager → StrategyConsultant]: Synthesize the research and analysis into a concrete
3-step growth plan. Prioritize the Instagram channel and subscription model.
[StrategyConsultant]: 3-Step Growth Plan:
Step 1 — Months 1-3: Launch Shopify store with Taster tier only. 100% Instagram
organic content, 3 posts/week. Goal: 200 active subscribers.
Step 2 — Months 4-6: Introduce Explorer tier + 3 micro-influencer partnerships
($500-$2K/creator, 5K-50K audience). Goal: 500 subscribers, positive unit economics.
Step 3 — Months 7-12: Activate paid social with proven creative assets, launch
Connoisseur tier with exclusive single-origin drops. Goal: 1,500 subscribers, $35K MRR.
=== PROJECT COMPLETE ===
Growth plan synthesized across market research, subscription economics, and channel
strategy. Full 3-step roadmap delivered with milestone targets and sequenced investment.
Pattern Comparison
| Pattern | Control Flow | Agent Awareness | Best For | Latency Profile |
|---|---|---|---|---|
| Sequential | Fixed pipeline | Each agent sees all prior outputs | Content pipelines, step-by-step transforms | Additive — sum of all agents |
| Concurrent | Parallel, no coordination | No agent sees others' outputs | Independent expert analysis, parallel generation | Fastest — max of all agents |
| Handoff | Agent-driven routing | Each agent sees full conversation history | Support triage, escalation workflows | Variable — depends on routing depth |
| Group Chat | Manager-controlled discourse | All agents see full conversation | Brainstorming, collaborative planning | Moderate — 2 API calls per round |
| Magentic | Ledger-driven adaptive loop | Manager sees full ledger; agents see only their subtask | Complex adaptive projects, research synthesis | Highest — multiple planning cycles |
Decision Guide
Is the workflow order fixed and known?
├─ Yes, linear steps that build on each other → Sequential
└─ Yes, independent analyses that run in parallel → Concurrent
Is the correct handler determined by the content of the request?
└─ Yes → Handoff
Is the best answer discovered through deliberation between agents?
└─ Yes → Group Chat
Is the path to completion itself unknown and evolving?
└─ Yes → Magentic
What to Watch Out For
Prompt-driven routing is powerful but brittle at the edges. Handoff and Group Chat rely on agents producing specific text signals (TRANSFER TO, FINISH). If the model paraphrases — "I'll transfer you to..." instead of "TRANSFER TO TechExpert" — the regex match fails silently. Anchor the signal with a hard constraint in your instructions ("respond ONLY with: TRANSFER TO [Name]") and test your regex against real model output before going to production.
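One way to harden the match is a forgiving, case-insensitive regex plus a whitelist check, sketched below with hypothetical agent names (parseHandoff and AGENTS are illustrative, not taken from the repo):

```javascript
// Hypothetical agent names for illustration; use the names you registered.
const AGENTS = ["TechExpert", "BillingExpert", "AccountExpert"];

// Forgiving matcher: case-insensitive, tolerates surrounding prose and the
// common "transfer you to" paraphrase instead of failing silently.
function parseHandoff(reply) {
  const match = reply.match(/TRANSFER(?:\s+you)?\s+TO\s+([A-Za-z]+)/i);
  if (!match) return null;
  // Whitelist check: never route to an agent name the model hallucinated.
  const target = AGENTS.find(
    (name) => name.toLowerCase() === match[1].toLowerCase()
  );
  return target ?? null;
}

console.log(parseHandoff("I'll transfer you to TechExpert now.")); // "TechExpert"
```

A `null` return is your cue to re-prompt the agent or fall back to a default handler rather than dropping the request.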
response_format: { type: "json_object" } requires JSON to be mentioned in the prompt. Azure OpenAI rejects the request if you enable JSON output mode but your prompt does not explicitly instruct the model to respond in JSON. Always include language like "Respond ONLY in this JSON format" when using this parameter. The schema description in the system prompt is what the model uses to structure its output — be precise about required fields.
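A minimal sketch of that pairing; the ledger fields (isComplete, nextAgent, subtask) are illustrative, not the repo's actual schema. JSON mode guarantees syntactically valid JSON, not conformance to your schema, so validate required fields before acting on the reply:

```javascript
// Hypothetical ledger schema for illustration; the repo's fields may differ.
const systemPrompt = `You are the Manager. Respond ONLY in this JSON format:
{ "isComplete": boolean, "nextAgent": string, "subtask": string }`;

// The request would pair that prompt with JSON mode, roughly:
// chatClient.chat.completions.create({
//   model, messages, response_format: { type: "json_object" },
// });

// JSON mode guarantees parseable JSON, not your schema; validate it.
function parseLedger(reply) {
  const data = JSON.parse(reply);
  for (const field of ["isComplete", "nextAgent", "subtask"]) {
    if (!(field in data)) throw new Error(`ledger missing field: ${field}`);
  }
  return data;
}
```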
Context windows grow in Sequential and Magentic. Every Sequential agent receives the full conversation history. By step five of a ten-step pipeline, the context is substantial. Magentic's factsGathered array grows similarly. For long pipelines, consider summarizing intermediate results before appending them, or windowing the context to the most recent N entries.
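A minimal windowing sketch, assuming history is a plain messages array (windowContext is a hypothetical helper, not part of the repo):

```javascript
// Keep the original task (first entry) plus the most recent N results;
// collapse everything in between into a one-line stub so the model still
// knows context was dropped.
function windowContext(history, maxRecent = 4) {
  if (history.length <= maxRecent + 1) return history;
  const dropped = history.length - 1 - maxRecent;
  return [
    history[0],
    { role: "user", content: `[${dropped} earlier step(s) omitted]` },
    ...history.slice(-maxRecent),
  ];
}
```

Summarizing the dropped middle with a cheap model call instead of a stub preserves more signal at the cost of extra latency.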
Concurrent agents are fully independent by design. If your use case requires any agent to reference another's findings before all have completed, Concurrent is the wrong pattern. Use Magentic instead — its round-based structure lets you gather findings from one agent before deciding what to ask the next.
Rate limits compound in Concurrent orchestration. Three simultaneous API calls produce three times the token throughput at once. If you are running into Azure OpenAI rate limits, implement exponential backoff with jitter, or add a concurrency limiter (e.g. p-limit) when scaling beyond 5-10 parallel agents.
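A sketch of exponential backoff with full jitter, assuming the client surfaces rate limits as errors carrying status 429 (withBackoff and the retry defaults are illustrative):

```javascript
// Retry a call with exponential backoff and full jitter. The defaults are
// illustrative; tune retries/baseMs to your Azure OpenAI quota.
async function withBackoff(fn, { retries = 5, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Only retry rate limits (HTTP 429), and only up to the retry budget.
      if (err?.status !== 429 || attempt >= retries) throw err;
      // Full jitter: random delay in [0, baseMs * 2^attempt).
      const delay = Math.random() * baseMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Wrap each agent call in the limiter, and layer a concurrency cap such as p-limit on top when fan-out grows; jitter prevents retries from re-colliding in lockstep.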
The maxRounds ceiling is a safety net, not a target. If your Group Chat Manager or Magentic Manager regularly hits the round limit without producing a FINISH or finalAnswer signal, the issue is the Manager's instructions — not a round count that is too low. A well-prompted Manager should complete most tasks in 3-5 rounds. Audit what the Manager is producing when it stalls and tighten the prompt before raising the ceiling.
Key Takeaways
Multi-agent orchestration is not a single pattern — it is a family of architectures, each suited to a different coordination problem.
- •Sequential for deterministic pipelines where order matters and each step builds on the last
- •Concurrent for independent parallel analysis where speed and perspective diversity are the goals
- •Handoff for intelligent routing where the correct handler depends on request content
- •Group Chat for collaborative discourse where the best answer emerges from structured deliberation
- •Magentic for complex, adaptive projects where the plan itself evolves as knowledge is gathered
All five share the same authentication foundation — DefaultAzureCredential via AIProjectClient — and the same underlying chatClient.chat.completions.create() call. What varies is only the coordination logic around those calls. This means the patterns are composable: a Magentic workflow can run a Concurrent sub-workflow for its research phase, or a Handoff system can route a request into a Group Chat for collaborative resolution.
The right pattern is determined by one question: how should the agents relate to each other? Answer that, and the architecture follows.
Full source code with working .env.example files and package dependencies for each pattern: azizjarrar/azure-multi-agent-orchestration-patterns
Aziz Jarrar
Full Stack Engineer