Ship it! Building Production Ready Agents — Mike Chambers, AWS

Introduction and Simple Agent Demo 00:03

  • Mike Chambers, a Developer Advocate for AWS, specializes in generative AI and focuses on getting agent code into production at cloud scale.
  • He previously contributed to the "Fundamentals of LLMs" course, which has been taken by over 370,000 people.
  • A simple Python agent, demonstrated running locally on a laptop, uses an 8-billion-parameter Llama 3.1 model.
  • This agent incorporates a "dice roller" tool and uses a system prompt with worked examples to interpret natural language commands (a minimal sketch follows this list).
  • In a demonstration, the agent successfully processed a command to "roll for initiative and add a dexterity modifier of five," using its tool to roll a D20 and add the modifier, yielding a result (e.g., 15).
  • The presentation aims to show how to transition this low-scale local agent code to a production-ready, cloud-scale environment.
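A rough sketch of what such a local agent might look like, assuming a hypothetical generate() placeholder standing in for the locally hosted Llama 3.1 8B model (e.g., via Ollama); the tool name, prompt wording, and JSON tool-call convention are illustrative, not the speaker's exact code:

```python
import json
import random

# System prompt with an example teaching the model to emit a JSON tool call.
SYSTEM_PROMPT = """You are a games master. When the user asks for a dice roll,
reply ONLY with JSON such as: {"tool": "roll_dice", "sides": 20, "modifier": 5}
Example: "roll for initiative, dexterity modifier of five" ->
{"tool": "roll_dice", "sides": 20, "modifier": 5}"""

def roll_dice(sides: int, modifier: int = 0) -> int:
    """The 'dice roller' tool: roll one die and add a modifier."""
    return random.randint(1, sides) + modifier

def generate(system_prompt: str, user_message: str) -> str:
    # Placeholder for the local Llama 3.1 8B call (e.g., via Ollama).
    # Returns a canned tool call so the sketch runs end to end.
    return '{"tool": "roll_dice", "sides": 20, "modifier": 5}'

def handle(user_message: str) -> str:
    reply = generate(SYSTEM_PROMPT, user_message)
    call = json.loads(reply)
    if call.get("tool") == "roll_dice":
        result = roll_dice(call["sides"], call.get("modifier", 0))
        return f"You rolled {result}."
    return reply

print(handle("Roll for initiative and add a dexterity modifier of five."))
```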

Anatomy of an Agent 04:51

  • For an agent to operate at cloud scale, several core components are essential:
    • Model: Provides natural language understanding, with pre-existing models making this component relatively straightforward to integrate.
    • Prompt: Defines the agent's purpose, capabilities, and personality.
    • Loop (Agentic Loop): Enables the agent to "think" by processing input, using tools, evaluating outcomes, and deciding whether further actions or iterations are needed (see the loop sketch after this list).
    • History (Conversational History): Crucial for the agent to remember its internal reasoning steps and tool calls within the context of a conversation, ensuring continuity.
    • Tools: Provide the agent with the ability to perform actions and interact with the external world.
  • These elements are considered the fundamental requirements for a minimum viable agent.
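A minimal sketch tying the five components together, assuming a caller-supplied call_model function and a simple JSON tool-call convention (both are assumptions for illustration, not the talk's code):

```python
import json

def agent_loop(user_message: str, call_model, tools: dict, max_turns: int = 5) -> str:
    """Minimal viable agent: prompt + history + loop + tools around a model."""
    history = [
        {"role": "system", "content": "You are an agent. Use tools when needed."},  # prompt
        {"role": "user", "content": user_message},
    ]
    for _ in range(max_turns):                                     # agentic loop
        reply = call_model(history)                                # model
        history.append({"role": "assistant", "content": reply})   # history
        try:
            call = json.loads(reply)                               # tool call?
        except json.JSONDecodeError:
            return reply                                           # plain text = final answer
        result = tools[call["tool"]](**call.get("args", {}))       # tools
        history.append({"role": "user", "content": f"Tool result: {result}"})
    return "Stopped after reaching the turn limit."
```

Usage would pass a dict such as {"roll_dice": roll_dice} for tools and any chat-completion wrapper for call_model.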

Hosting Agents with AWS Bedrock 07:40

  • AWS offers services designed to host and scale these agent components in the cloud.
  • Models: Amazon Bedrock provides access to models from leading providers, including Anthropic, Amazon (the Nova family), Meta, Mistral AI, and AI21 Labs.
  • Amazon Bedrock Agents: This is a fully managed service that eliminates the need for users to manage infrastructure, ensuring agents are cloud-scale.
  • Within Bedrock Agents, the agent's "instruction" (personality/prompt) is configured, while the service automatically handles the agentic loop and conversational history.
  • Action Groups: These are collections of tools that connect the agent to specific functionalities, typically implemented using AWS Lambda functions.
  • Lambda functions are well-suited for hosting these tools because they scale automatically and can call external services or other AWS services (a handler sketch follows this list).
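A sketch of what a "roll dice" action group Lambda might look like, assuming the function-details style event/response shape that Bedrock Agents passes to Lambda; exact field names should be verified against the current Bedrock documentation:

```python
import random

def lambda_handler(event, context):
    """Action group tool: roll a die with the requested number of sides.
    Assumes the function-details event/response format for Bedrock Agents."""
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    sides = int(params.get("numberOfSides", 20))
    result = random.randint(1, sides)

    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {
                "responseBody": {"TEXT": {"body": f"Rolled a {result} on a D{sides}."}}
            },
        },
    }
```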

Building and Testing an Agent Demo 10:35

  • A demonstration showcased building an agent directly in the AWS console, noting that all steps are also achievable via Infrastructure as Code frameworks (e.g., Terraform, CloudFormation, SDK, SAM).
  • The process involved creating and naming the agent, providing a descriptive purpose, selecting a model (e.g., Anthropic Claude 3.5 Haiku), and defining its instructions (e.g., "You're a games master...").
  • An action group was then added, linked to a Lambda function (conveniently set up using a quick start option) which housed the "roll dice" tool's logic.
  • Parameters for the tool, such as numberOfSides (specified as a required integer), were defined and described for the LLM's comprehension.
  • Python code for the dice roll, including import random, was inserted and deployed within the Lambda function's console editor, with assistance from Amazon Q Developer.
  • The agent was then prepared (which includes creating alias IDs to support a production-ready software development lifecycle) and tested in the console.
  • The agent successfully responded to a natural language query, confirming its fully hosted and managed cloud-scale functionality.
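Once prepared and aliased, the agent can also be exercised outside the console; a hedged sketch using boto3's bedrock-agent-runtime client follows, where the agent and alias IDs are placeholders and the streaming-response handling should be checked against current boto3 documentation:

```python
import uuid
import boto3

# Placeholders: substitute the IDs shown in the Bedrock console after
# preparing the agent and creating an alias.
AGENT_ID = "YOUR_AGENT_ID"
AGENT_ALIAS_ID = "YOUR_AGENT_ALIAS_ID"

client = boto3.client("bedrock-agent-runtime")

response = client.invoke_agent(
    agentId=AGENT_ID,
    agentAliasId=AGENT_ALIAS_ID,
    sessionId=str(uuid.uuid4()),  # ties multi-turn conversational history together
    inputText="Roll for initiative and add a dexterity modifier of five.",
)

# The completion is returned as an event stream of chunks.
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)
print(answer)
```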

Conclusion and Resources 18:35

  • Free DeepLearning.AI courses focused on Bedrock agents are available, offering a complimentary AWS environment for hands-on experimentation.
  • The speaker extended an invitation for further discussions on topics such as cloud-scale MCP servers, an open-source SDK for model-first agents, and acquiring real-life D20 dice.