⚡️Anthropic vs Cognition on Multi-Agents: A Breakdown with Dylan Davis

Introductions and Context 00:02

  • Dylan Davis (DS squared) is the founder of Gradient Labs, helping mid-size companies implement AI for automation.
  • The discussion is a response to recent blog posts by Anthropic and Cognition, published back-to-back, with contrasting views on multi-agent vs. single-agent AI architectures.
  • Both posts make valid claims; the discussion centers on when each approach is appropriate.

Anthropic’s Multi-Agent Architecture Explained 02:34

  • Anthropic uses a multi-agent system in their Claude "deep research" feature.
  • For complex, broad questions (e.g., listing 100+ AI companies), a lead agent decomposes the task into sub-questions assigned to multiple sub-agents.
  • Sub-agents research in parallel, using extended context windows, and synthesize their answers.
  • Responses are validated by a dedicated citations sub-agent to ensure cited information matches sources.
  • The lead agent consolidates all findings and delivers a comprehensive report to the user.
  • Multi-agent systems offer high-quality, diverse, and comprehensive answers due to distributed research.
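
The lead-agent pattern described above can be sketched as a simple fan-out/fan-in loop. This is an illustrative sketch, not Anthropic's implementation; `run_subagent` is a hypothetical stand-in for a real LLM call.

```python
import asyncio

# Hypothetical stand-in for an LLM-backed sub-agent; a real system
# would call a model API with its own tools and context window.
async def run_subagent(sub_question: str) -> str:
    await asyncio.sleep(0)  # simulate I/O-bound research work
    return f"findings for: {sub_question}"

async def lead_agent(task: str, num_subagents: int = 3) -> str:
    # 1. Decompose the broad task into independent sub-questions.
    sub_questions = [f"{task} (aspect {i + 1})" for i in range(num_subagents)]
    # 2. Fan out: sub-agents research in parallel.
    findings = await asyncio.gather(*(run_subagent(q) for q in sub_questions))
    # 3. Fan in: consolidate findings into one report (a real lead agent
    #    would make another LLM call to synthesize; here we just join).
    return "\n".join(findings)

report = asyncio.run(lead_agent("list 100+ AI companies"))
```

A citations sub-agent would fit as one more parallel (or final) step that cross-checks each finding against its sources before the report is returned.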

Single-Agent Architecture (Cognition) 07:02

  • Cognition prefers a single-agent approach for many use cases, especially coding tasks.
  • The single agent takes in a task, creates a plan, and processes sub-tasks sequentially.
  • This approach avoids bloating the context window, but can be slower and provide less diversity in insights compared to parallelized multi-agent systems.
  • For research tasks, a single-agent run uses roughly 4x the tokens of a basic conversation, while a multi-agent run uses around 15x, making cost a key consideration.
  • However, multi-agent approaches can outperform single-agent ones by as much as 80% on quality metrics.
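
The sequential flow above can be sketched as a loop over sub-tasks that share one growing context. Again a hedged sketch: `llm` is a hypothetical placeholder, not a real API.

```python
# Hypothetical placeholder for a model call; names are illustrative only.
def llm(prompt: str) -> str:
    return f"result({prompt.splitlines()[-1]})"

def single_agent(task: str, steps: list[str]) -> list[str]:
    context = [f"task: {task}"]  # one shared context window, growing over time
    outputs = []
    for step in steps:
        # Each sub-task sees everything done so far, so decisions stay
        # consistent -- at the price of sequential latency and a context
        # window that keeps growing with every step.
        prompt = "\n".join(context) + f"\nnext: {step}"
        result = llm(prompt)
        outputs.append(result)
        context.append(f"{step} -> {result}")
    return outputs

outs = single_agent("refactor module", ["plan", "edit", "test"])
```

The trade-off is visible in the loop itself: nothing runs in parallel, but every step inherits the full history of prior decisions.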

Evaluation Metrics and Considerations 09:28

  • Anthropic evaluated outputs on five dimensions: factual accuracy, citation accuracy, completeness, source quality, and tool efficiency.
  • Adjustments were made to penalize low-quality or SEO-gamed sources.
  • Initially, multiple LLM "judges" were used for evaluation; ultimately, a single LLM judge yielded more consistent results.
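
A single-judge rubric evaluation along those five dimensions might look like the sketch below; `judge_llm` is a hypothetical stand-in for an LLM call with a scoring prompt.

```python
# The five dimensions mentioned in the discussion.
DIMENSIONS = [
    "factual accuracy",
    "citation accuracy",
    "completeness",
    "source quality",
    "tool efficiency",
]

def judge_llm(report: str, dimension: str) -> float:
    # Stand-in: a real judge would be an LLM prompted to score 0-1
    # on one dimension, with penalties for SEO-gamed sources.
    return 0.8

def evaluate(report: str) -> dict[str, float]:
    # One judge scores every dimension, which (per the discussion)
    # proved more consistent than aggregating several separate judges.
    return {d: judge_llm(report, d) for d in DIMENSIONS}

scores = evaluate("example research report")
```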

Use Cases and Context Engineering 12:01

  • Multi-agent architectures excel in research and domains where parallel, independent work is possible and valuable.
  • Coding tasks often require sequential, closely coupled actions, making single-agent systems preferable due to the need for maintaining consistent shared context.
  • Parallel agents in code generation may produce incompatible outputs that are hard to reconcile.

Decision Framework: When to Choose Multi-Agent vs. Single-Agent 19:54

  • Use multi-agent architectures when tasks can be divided into independent parts, diversity of perspectives is valuable, and higher token costs are justified for quality.
  • Use single-agent architectures when tasks have strong dependencies, require reliability, and context from previous actions must be preserved and passed forward.
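
The two bullets above reduce to a small heuristic, sketched here with illustrative parameter names:

```python
def choose_architecture(independent_subtasks: bool,
                        diversity_valuable: bool,
                        token_budget_flexible: bool) -> str:
    # Multi-agent pays off only when the work parallelizes cleanly,
    # varied perspectives matter, and the ~15x token cost is acceptable.
    if independent_subtasks and diversity_valuable and token_budget_flexible:
        return "multi-agent"
    # Otherwise a single agent keeps one consistent context across
    # strongly dependent steps.
    return "single-agent"

print(choose_architecture(True, True, True))    # research-style task
print(choose_architecture(False, False, True))  # coding-style task
```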

Visual Tools and Content Creation Tips 21:02

  • Dylan often uses visual artifacts generated by AI (such as Claude) to clarify and communicate architectural choices and reasoning flows.
  • Visuals are helpful for both understanding and teaching complex AI concepts and workflows.

Cost vs. Performance Trade-offs and Industry Trends 24:15

  • Many practitioners disregard cost for research agents, anticipating rapid decreases in the price of intelligence and designing products for where model costs are headed rather than where they are today.
  • Prioritizing quality, even at higher token costs, can enable faster user growth and longer-term success, similar to strategies used in tech startups willing to invest heavily for market share.

Closing Thoughts and Contact Info 25:57

  • Emphasis on selecting the agent architecture that fits the problem, not just following industry trends.
  • Dylan Davis is open to connections on LinkedIn and offers a no-fluff, 30-day AI insights email sequence.
  • Encouragement for community engagement and featuring diverse voices in AI discussions.