A2A & MCP Workshop: Automating Business Processes with LLMs — Damien Murphy, Bench

Introduction and Workshop Overview 00:15

  • Damien Murphy introduces himself and outlines his experience in full-stack development, solutions engineering, and AI agents.
  • Workshop focuses on building a multi-agent system using A2A (Agent-to-Agent) protocol and MCP (Model Context Protocol) to automate business processes.
  • Bench Computing, Murphy's company, is developing an autonomous AI agent aimed at teams and enterprises, supporting parallel task automation.

What Are A2A and MCP? 02:22

  • A2A allows agents to communicate remotely over the web, facilitating agent specialization (many agents focused on single tasks) and task delegation.
  • MCP acts as a standard interface for AI agents to access and use tools and resources; often described as the "USB-C for AI."
  • Zapier's MCP implementation gives access to around 7,000 tools; overall, MCP supports 10,000+ tools.
  • MCP simplifies integration, removing the need for custom API work, and enables plug-in architectures.
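To make the "standard interface" idea concrete, here is a minimal sketch of the JSON-RPC 2.0 messages MCP clients exchange with a server: `tools/list` to discover what is exposed, then `tools/call` to invoke a tool by name. The `slack_send_message` tool name and its arguments are hypothetical, for illustration only.

```python
import json

def mcp_request(req_id: int, method: str, params: dict) -> str:
    """Build a JSON-RPC 2.0 message of the kind MCP uses on the wire."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id, "method": method, "params": params})

# Ask the server which tools it exposes...
list_req = mcp_request(1, "tools/list", {})

# ...then invoke one by name with structured arguments.
call_req = mcp_request(2, "tools/call", {
    "name": "slack_send_message",  # hypothetical tool name
    "arguments": {"channel": "#eng", "text": "Build passed"},
})

parsed = json.loads(call_req)
print(parsed["method"])  # tools/call
```

Because every server speaks this same shape, a client that can issue these two methods can drive any of the thousands of tools mentioned above without bespoke API work.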

When to Use A2A vs. MCP 04:48

  • Use MCP to integrate with external tools and resources, especially when you lack direct control or want to scale integrations.
  • Use A2A for communication between unrelated, remote agents outside your codebase or organization.
  • If you have full control of your tools and agents, local function calls are recommended for simplicity and speed.
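The last point is worth seeing side by side: when the tool lives in your own codebase, there is no server or protocol hop at all. A sketch, with `create_github_issue` as a hypothetical in-process stand-in:

```python
def create_github_issue(title: str, body: str) -> dict:
    """Stand-in for a tool you fully control: just a local function."""
    return {"title": title, "body": body, "state": "open"}

# No MCP server, no A2A hop: the orchestrator calls the function directly,
# which is simpler to debug and faster than any remote round trip.
issue = create_github_issue("Login button unresponsive", "Repro steps attached")
print(issue["state"])  # open
```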

Limitations and Challenges with A2A and MCP 07:52

  • MCP may only partially meet integration needs, as you can only use what third-party servers expose.
  • Complexity, unpredictable capacity, and lack of detailed insight are common with remote A2A agents.
  • Integration issues, such as silent failures and mismatches between expected tool behaviors and actual results, can occur (example with Slack channel naming).
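The Slack channel-naming example above can be guarded against explicitly: Slack lowercases names and disallows spaces, so a tool may "succeed" while writing somewhere other than intended. A defensive sketch (the normalization rules shown are a simplification of Slack's actual ones):

```python
import re

def normalize_channel(name: str) -> str:
    """Approximate Slack's channel-name rules: lowercase, hyphens for spaces."""
    return re.sub(r"[^a-z0-9_-]", "-", name.lower().replace(" ", "-"))

def checked_post(requested_channel: str, tool_result: dict) -> dict:
    """Compare what the tool reports against what we expected."""
    expected = normalize_channel(requested_channel)
    actual = tool_result.get("channel")
    if actual != expected:
        # Surface the mismatch instead of letting it fail silently.
        raise RuntimeError(f"expected #{expected}, tool wrote to #{actual}")
    return tool_result

print(normalize_channel("Feature Requests"))  # feature-requests
```

The point is general: because MCP tool results come back from a black box, the caller should verify the effect it asked for, not just the absence of an error.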

Workshop Codebase and Setup 11:02

  • The provided repo contains implementations for the host agent, sub-agents (Slack and GitHub), A2A server/client, and MCP client.
  • Setup involves configuring Zapier MCP, supplying a Gemini API key, and integrating with Slack and GitHub.
  • Host agent coordinates sub-agents, demonstrating division of labor between orchestration (central host) and specific automated tasks (sub-agents).

Agent Roles and Orchestration 15:36

  • Host agent plans and delegates incoming tasks (e.g., meeting transcript analysis) to sub-agents.
  • Slack agent posts messages based on identified topics or feature requests; GitHub agent creates issues based on detected bugs.
  • Remote agents (like Bench) can handle comprehensive research tasks, though A2A makes describing multi-capable agents challenging.
  • Genkit, used for orchestration, currently limits parallel sub-agent calls to five.
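A parallelism cap like the one described can be enforced generically with a semaphore around each sub-agent call. A minimal sketch (not Genkit's actual API; the agent names and tasks are illustrative):

```python
import asyncio

MAX_PARALLEL = 5  # mirrors the limit mentioned for Genkit orchestration

async def run_subagent(name: str, task: str, sem: asyncio.Semaphore) -> str:
    async with sem:                # at most five calls in flight at once
        await asyncio.sleep(0)     # stand-in for a remote A2A/MCP round trip
        return f"{name} done: {task}"

async def orchestrate(tasks: list[tuple[str, str]]) -> list[str]:
    sem = asyncio.Semaphore(MAX_PARALLEL)
    return await asyncio.gather(*(run_subagent(n, t, sem) for n, t in tasks))

results = asyncio.run(orchestrate([("slack", "post summary"),
                                   ("github", "file bug")]))
print(results[0])
```

Queuing excess calls behind the semaphore keeps the host responsive even when a transcript yields more delegable tasks than the limit allows at once.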

Demonstration and Debugging 21:19

  • Demo shows host agent orchestrating sub-agents in parallel, managing incoming webhooks, and relaying outputs (Slack and GitHub actions).
  • Agent logs and dashboards illustrate process flow and highlight the importance of error detection and handling.
  • Debugging is more complex with remote agents, as logs may be inaccessible.

Context Management and Prompt Caching 32:11

  • Sub-agents process large raw datasets, returning only summaries to the host to keep context windows small and efficient.
  • Tool and agent context growth is a major scaling concern; context management and pruning strategies are critical for controlling cost and performance.
  • Prompt caching can reduce recurring context costs but must be managed carefully for optimal results.
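The summarize-at-the-boundary pattern from the first bullet can be sketched in a few lines. Here `summarize` is a crude truncation standing in for an LLM summarization step; the point is that only the compact result crosses back into the host's context:

```python
def summarize(raw: str, limit: int = 120) -> str:
    """Crude stand-in for an LLM summarization step."""
    return raw[:limit] + ("…" if len(raw) > limit else "")

def subagent_process(transcript: str) -> str:
    # The sub-agent holds the full transcript in ITS OWN context window...
    analysis = transcript  # (a real agent would reason over this)
    # ...but only a compact summary flows back to the host.
    return summarize(analysis)

host_context = subagent_process("x" * 5000)
print(len(host_context))  # far smaller than the 5000-char input
```

Keeping the host's context this lean is what makes the division of labor pay off: the host's prompt grows with the number of summaries, not with the size of the raw data.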

Security and Compliance Considerations 30:04, 49:00

  • Authentication is handled via headers, OAuth, or server-side policies; users log in, and authorization is enforced based on their access.
  • For high-security/regulated environments (finance, defense), it is advised to run LLMs internally and avoid external MCP/A2A agents, using encryption, VPC isolation, and mutual TLS.
  • Security measures (e.g., whitelisting, managed endpoints) are not dictated by A2A/MCP protocols but by organizational posture.
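Since the protocols themselves don't mandate such controls, a typical organizational measure is an endpoint allowlist applied before any agent or tool is contacted. A minimal sketch with hypothetical hostnames:

```python
from urllib.parse import urlparse

# Organizational policy, not part of the A2A/MCP specs themselves.
ALLOWED_HOSTS = {"mcp.internal.example.com", "agents.partner.example.com"}

def is_permitted(endpoint: str) -> bool:
    """Allow only TLS connections to explicitly approved hosts."""
    parsed = urlparse(endpoint)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS

print(is_permitted("https://mcp.internal.example.com/sse"))  # True
print(is_permitted("http://mcp.internal.example.com/sse"))   # False: not TLS
print(is_permitted("https://unknown.example.org/mcp"))       # False: not listed
```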

Observability, Testing, and Human-in-the-Loop 61:17, 73:02

  • Observability is mostly custom-built; standard agent ops tools may not accommodate deeply composable or dynamic sub-agent architectures.
  • Testing uses demo accounts or synthetic data to avoid polluting live systems; agents interact with external tools in safe, non-production environments.
  • Human-in-the-loop functionality is being explored by modeling humans as agent "tools," allowing agents to message humans directly for business-critical input.
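Modeling a human as an agent "tool" can be as simple as wrapping a blocking ask-a-person channel in the same interface every other tool uses. A sketch (the `Tool` shape and `ask_human` name are illustrative, not from the workshop code):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    run: Callable[[str], str]

def make_human_tool(ask_human: Callable[[str], str]) -> Tool:
    """Expose a human reviewer through the same interface as any other tool."""
    return Tool(name="ask_human", run=ask_human)

# In tests the "human" is a canned responder; in production it might be a
# Slack DM that blocks until someone replies.
human = make_human_tool(lambda question: "approved")
print(human.run("Ship the release notes?"))  # approved
```

Because the agent sees `ask_human` as just another tool, no special-case orchestration logic is needed for business-critical sign-offs.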

Comparisons and Best Practices: A2A, MCP, REST APIs 78:25

  • REST APIs suffice for systems entirely within your control, offering state management in the application layer rather than in agent contexts.
  • MCP/A2A provide state, context windows, and interoperability for more complex, distributed, or third-party integrated workflows.
  • When using MCP, care must be taken with tool selection and scope to avoid context bloat and unnecessary data exposure.
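Tool scoping from the last bullet is mechanically simple: filter the server's catalog down to what the current task needs before it ever reaches the model's prompt. A sketch with hypothetical tool names:

```python
def select_tools(all_tools: list[dict], needed: set[str]) -> list[dict]:
    """Pass the model only the tools the task needs, not the full catalog."""
    return [t for t in all_tools if t["name"] in needed]

catalog = [{"name": "slack_send_message"},
           {"name": "github_create_issue"},
           {"name": "crm_export_contacts"}]  # hypothetical tool names

scoped = select_tools(catalog, {"slack_send_message"})
print(len(scoped))  # 1: smaller prompt, and no exposure of unrelated tools
```

With catalogs in the thousands of tools, this filtering is the difference between a focused prompt and one dominated by tool schemas.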

Advanced Topics: Orchestration, DAGs, and Prompt Strategies 81:08

  • Directed acyclic graphs (DAGs) are suggested for planning multi-step, branching workflows with dependencies among agents (similar to CI/CD pipelines).
  • Developers must guide orchestration and context slicing via prompt design and host agent logic; sub-agents generally receive concise, targeted tasks.
  • Experiments suggest that agent non-determinism and context management remain open challenges, impacting both reproducibility and cost.
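The DAG idea maps directly onto the standard library: declare each step's dependencies and let a topological sort produce a valid execution order, just as a CI/CD pipeline would. The workflow below is a hypothetical version of the workshop scenario:

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: analyze the transcript first, then fan out.
dag = {
    "analyze_transcript": set(),
    "post_to_slack": {"analyze_transcript"},
    "file_github_issue": {"analyze_transcript"},
    "notify_owner": {"post_to_slack", "file_github_issue"},
}

order = list(TopologicalSorter(dag).static_order())
print(order[0])   # analyze_transcript always runs first
print(order[-1])  # notify_owner always runs last
```

The two middle steps have no ordering between them, which is exactly where the host can dispatch sub-agents in parallel.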

Current Maturity and Future Directions 71:15

  • MCP is more mature and widely integrated than A2A, but both are evolving—Bench currently uses only MCP.
  • A2A is expected to grow as more enterprise partners (e.g., Salesforce) release compatible agents.
  • Future direction includes further abstraction of tool use, better observability, and richer human-agent collaboration paradigms.

Q&A and Closing Thoughts 62:26, 83:02

  • Key areas of audience interest: integrating security/compliance controls, improving orchestration reliability, testing strategies, and efficient scaling.
  • Feedback and community input are encouraged; Bench is offering early access and welcomes collaboration.
  • Rapid evolution of these protocols means best practices are still forming; adaptability and continuous evaluation are recommended.