A2A & MCP Workshop: Automating Business Processes with LLMs — Damien Murphy, Bench

Introduction and Workshop Overview 00:15

  • Damien Murphy introduces himself and outlines his experience in full-stack development, solutions engineering, and AI agents.
  • Workshop focuses on building a multi-agent system using A2A (Agent-to-Agent) protocol and MCP (Model Context Protocol) to automate business processes.
  • Bench Computing, Murphy's company, is developing an autonomous AI agent aimed at teams and enterprises, supporting parallel task automation.

What Are A2A and MCP? 02:22

  • A2A allows agents to communicate remotely over the web, facilitating agent specialization (many agents focused on single tasks) and task delegation.
  • MCP acts as a standard interface for AI agents to access and use tools and resources; often described as the "USB-C for AI."
  • Zapier's MCP implementation gives access to around 7,000 tools; overall, MCP supports 10,000+ tools.
  • MCP simplifies integration, removing the need for custom API work, and enables plug-in architectures.
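To make the "standard interface" idea concrete, here is a minimal sketch of the JSON-RPC 2.0 messages MCP clients exchange with a server: `tools/list` to discover what is exposed, then `tools/call` to invoke a tool by name. The `slack_send_message` tool name and its arguments are hypothetical, for illustration only.

```python
import json

def mcp_request(req_id: int, method: str, params: dict) -> str:
    """Build a JSON-RPC 2.0 message of the kind MCP uses on the wire."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id, "method": method, "params": params})

# Ask the server which tools it exposes...
list_req = mcp_request(1, "tools/list", {})

# ...then invoke one by name with structured arguments.
call_req = mcp_request(2, "tools/call", {
    "name": "slack_send_message",  # hypothetical tool name
    "arguments": {"channel": "#eng", "text": "Build passed"},
})

parsed = json.loads(call_req)
print(parsed["method"])  # tools/call
```

Because every server speaks this same shape, a client that can issue these two methods can drive any of the thousands of tools mentioned above without bespoke API work.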

When to Use A2A vs. MCP 04:48

  • Use MCP to integrate with external tools and resources, especially when you lack direct control or want to scale integrations.
  • Use A2A for communication between unrelated, remote agents outside your codebase or organization.
  • If you have full control of your tools and agents, local function calls are recommended for simplicity and speed.
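The last point is worth seeing side by side: when the tool lives in your own codebase, there is no server or protocol hop at all. A sketch, with `create_github_issue` as a hypothetical in-process stand-in:

```python
def create_github_issue(title: str, body: str) -> dict:
    """Stand-in for a tool you fully control: just a local function."""
    return {"title": title, "body": body, "state": "open"}

# No MCP server, no A2A hop: the orchestrator calls the function directly,
# which is simpler to debug and faster than any remote round trip.
issue = create_github_issue("Login button unresponsive", "Repro steps attached")
print(issue["state"])  # open
```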

Limitations and Challenges with A2A and MCP 07:52

  • MCP may only partially meet integration needs, as you can only use what third-party servers expose.
  • Complexity, unpredictable capacity, and lack of detailed insight are common with remote A2A agents.
  • Integration issues, such as silent failures and mismatches between expected tool behaviors and actual results, can occur (example with Slack channel naming).
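The Slack channel-naming example above can be guarded against explicitly: Slack lowercases names and disallows spaces, so a tool may "succeed" while writing somewhere other than intended. A defensive sketch (the normalization rules shown are a simplification of Slack's actual ones):

```python
import re

def normalize_channel(name: str) -> str:
    """Approximate Slack's channel-name rules: lowercase, hyphens for spaces."""
    return re.sub(r"[^a-z0-9_-]", "-", name.lower().replace(" ", "-"))

def checked_post(requested_channel: str, tool_result: dict) -> dict:
    """Compare what the tool reports against what we expected."""
    expected = normalize_channel(requested_channel)
    actual = tool_result.get("channel")
    if actual != expected:
        # Surface the mismatch instead of letting it fail silently.
        raise RuntimeError(f"expected #{expected}, tool wrote to #{actual}")
    return tool_result

print(normalize_channel("Feature Requests"))  # feature-requests
```

The point is general: because MCP tool results come back from a black box, the caller should verify the effect it asked for, not just the absence of an error.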

Workshop Codebase and Setup 11:02

  • The provided repo contains implementations for the host agent, sub-agents (Slack and GitHub), A2A server/client, and MCP client.
  • Setup involves configuring Zapier MCP, supplying a Gemini API key, and integrating with Slack and GitHub.
  • Host agent coordinates sub-agents, demonstrating division of labor between orchestration (central host) and specific automated tasks (sub-agents).

Agent Roles and Orchestration 15:36

  • Host agent plans and delegates incoming tasks (e.g., meeting transcript analysis) to sub-agents.
  • Slack agent posts messages based on identified topics or feature requests; GitHub agent creates issues based on detected bugs.
  • Remote agents (like Bench) can handle comprehensive research tasks, though A2A makes describing multi-capable agents challenging.
  • Genkit, used for orchestration, currently limits parallel sub-agent calls to five.
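A parallelism cap like the one described can be enforced generically with a semaphore around each sub-agent call. A minimal sketch (not Genkit's actual API; the agent names and tasks are illustrative):

```python
import asyncio

MAX_PARALLEL = 5  # mirrors the limit mentioned for Genkit orchestration

async def run_subagent(name: str, task: str, sem: asyncio.Semaphore) -> str:
    async with sem:                # at most five calls in flight at once
        await asyncio.sleep(0)     # stand-in for a remote A2A/MCP round trip
        return f"{name} done: {task}"

async def orchestrate(tasks: list[tuple[str, str]]) -> list[str]:
    sem = asyncio.Semaphore(MAX_PARALLEL)
    return await asyncio.gather(*(run_subagent(n, t, sem) for n, t in tasks))

results = asyncio.run(orchestrate([("slack", "post summary"),
                                   ("github", "file bug")]))
print(results[0])
```

Queuing excess calls behind the semaphore keeps the host responsive even when a transcript yields more delegable tasks than the limit allows at once.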

Demonstration and Debugging 21:19

  • Demo shows host agent orchestrating sub-agents in parallel, managing incoming webhooks, and relaying outputs (Slack and GitHub actions).
  • Agent logs and dashboards illustrate process flow and highlight the importance of error detection and handling.
  • Debugging is more complex with remote agents, as logs may be inaccessible.

Context Management and Prompt Caching 32:11

  • Sub-agents process large raw datasets, returning only summaries to the host to keep context windows small and efficient.
  • Tool and agent context growth is a major scaling concern; context management and pruning strategies are critical for controlling cost and performance.
  • Prompt caching can reduce recurring context costs but must be managed carefully for optimal results.
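The summarize-at-the-boundary pattern from the first bullet can be sketched in a few lines. Here `summarize` is a crude truncation standing in for an LLM summarization step; the point is that only the compact result crosses back into the host's context:

```python
def summarize(raw: str, limit: int = 120) -> str:
    """Crude stand-in for an LLM summarization step."""
    return raw[:limit] + ("…" if len(raw) > limit else "")

def subagent_process(transcript: str) -> str:
    # The sub-agent holds the full transcript in ITS OWN context window...
    analysis = transcript  # (a real agent would reason over this)
    # ...but only a compact summary flows back to the host.
    return summarize(analysis)

host_context = subagent_process("x" * 5000)
print(len(host_context))  # far smaller than the 5000-char input
```

Keeping the host's context this lean is what makes the division of labor pay off: the host's prompt grows with the number of summaries, not with the size of the raw data.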

Security and Compliance Considerations 30:04, 49:00

  • Authentication is handled via headers, OAuth, or server-side policies; users log in, and authorization is enforced based on their access.
  • For high-security/regulated environments (finance, defense), it is advised to run LLMs internally and avoid external MCP/A2A agents, using encryption, VPC isolation, and mutual TLS.
  • Security measures (e.g., whitelisting, managed endpoints) are not dictated by A2A/MCP protocols but by organizational posture.
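Since the protocols themselves don't mandate such controls, a typical organizational measure is an endpoint allowlist applied before any agent or tool is contacted. A minimal sketch with hypothetical hostnames:

```python
from urllib.parse import urlparse

# Organizational policy, not part of the A2A/MCP specs themselves.
ALLOWED_HOSTS = {"mcp.internal.example.com", "agents.partner.example.com"}

def is_permitted(endpoint: str) -> bool:
    """Allow only TLS connections to explicitly approved hosts."""
    parsed = urlparse(endpoint)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS

print(is_permitted("https://mcp.internal.example.com/sse"))  # True
print(is_permitted("http://mcp.internal.example.com/sse"))   # False: not TLS
print(is_permitted("https://unknown.example.org/mcp"))       # False: not listed
```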

Observability, Testing, and Human-in-the-Loop 61:17, 73:02

  • Observability is mostly custom-built; standard agent ops tools may not accommodate deeply composable or dynamic sub-agent architectures.
  • Testing uses demo accounts or synthetic data to avoid polluting live systems; agents interact with external tools in safe, non-production environments.
  • Human-in-the-loop functionality is being explored by modeling humans as agent "tools," allowing agents to message humans directly for business-critical input.
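Modeling a human as an agent "tool" can be as simple as wrapping a blocking ask-a-person channel in the same interface every other tool uses. A sketch (the `Tool` shape and `ask_human` name are illustrative, not from the workshop code):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    run: Callable[[str], str]

def make_human_tool(ask_human: Callable[[str], str]) -> Tool:
    """Expose a human reviewer through the same interface as any other tool."""
    return Tool(name="ask_human", run=ask_human)

# In tests the "human" is a canned responder; in production it might be a
# Slack DM that blocks until someone replies.
human = make_human_tool(lambda question: "approved")
print(human.run("Ship the release notes?"))  # approved
```

Because the agent sees `ask_human` as just another tool, no special-case orchestration logic is needed for business-critical sign-offs.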

Comparisons and Best Practices: A2A, MCP, REST APIs 78:25

  • REST APIs suffice for systems entirely within your control, offering state management in the application layer rather than in agent contexts.
  • MCP/A2A provide state, context windows, and interoperability for more complex, distributed, or third-party integrated workflows.
  • When using MCP, care must be taken with tool selection and scope to avoid context bloat and unnecessary data exposure.
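Tool scoping from the last bullet is mechanically simple: filter the server's catalog down to what the current task needs before it ever reaches the model's prompt. A sketch with hypothetical tool names:

```python
def select_tools(all_tools: list[dict], needed: set[str]) -> list[dict]:
    """Pass the model only the tools the task needs, not the full catalog."""
    return [t for t in all_tools if t["name"] in needed]

catalog = [{"name": "slack_send_message"},
           {"name": "github_create_issue"},
           {"name": "crm_export_contacts"}]  # hypothetical tool names

scoped = select_tools(catalog, {"slack_send_message"})
print(len(scoped))  # 1: smaller prompt, and no exposure of unrelated tools
```

With catalogs in the thousands of tools, this filtering is the difference between a focused prompt and one dominated by tool schemas.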

Advanced Topics: Orchestration, DAGs, and Prompt Strategies 81:08

  • Directed acyclic graphs (DAGs) are suggested for planning multi-step, branching workflows with dependencies among agents (similar to CI/CD pipelines).
  • Developers must guide orchestration and context slicing via prompt design and host agent logic; sub-agents generally receive concise, targeted tasks.
  • Experiments suggest that agent non-determinism and context management remain open challenges, impacting both reproducibility and cost.
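The DAG idea maps directly onto the standard library: declare each step's dependencies and let a topological sort produce a valid execution order, just as a CI/CD pipeline would. The workflow below is a hypothetical version of the workshop scenario:

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: analyze the transcript first, then fan out.
dag = {
    "analyze_transcript": set(),
    "post_to_slack": {"analyze_transcript"},
    "file_github_issue": {"analyze_transcript"},
    "notify_owner": {"post_to_slack", "file_github_issue"},
}

order = list(TopologicalSorter(dag).static_order())
print(order[0])   # analyze_transcript always runs first
print(order[-1])  # notify_owner always runs last
```

The two middle steps have no ordering between them, which is exactly where the host can dispatch sub-agents in parallel.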

Current Maturity and Future Directions 71:15

  • MCP is more mature and widely integrated than A2A, but both are evolving—Bench currently uses only MCP.
  • A2A is expected to grow as more enterprise partners (e.g., Salesforce) release compatible agents.
  • Future direction includes further abstraction of tool use, better observability, and richer human-agent collaboration paradigms.

Q&A and Closing Thoughts 62:26, 83:02

  • Key areas of audience interest: integrating security/compliance controls, improving orchestration reliability, testing strategies, and efficient scaling.
  • Feedback and community input are encouraged; Bench is offering early access and welcomes collaboration.
  • Rapid evolution of these protocols means best practices are still forming; adaptability and continuous evaluation are recommended.