Building Alice’s Brain: an AI Sales Rep that Learns Like a Human - Sherwood & Satwik, 11x

Introduction & Context 00:03

  • 11X is building digital workers for go-to-market organizations, including Alice (AI SDR) and Julian (voice agent)
  • Alice functions as an AI sales development representative, responsible for sourcing leads, contacting them, and booking meetings
  • Alice generates about 50,000 personalized emails daily, compared to 20-50 from a human SDR, and manages campaigns for around 300 organizations
  • Two key requirements for Alice: knowledge of the seller (products, services, case studies, etc.) and the lead (role, responsibilities, pain points)

Challenges with the Previous System 03:09

  • Previously, sellers had to manually input their business context via a "library" in the dashboard
  • The manual process was tedious, created onboarding friction, and led to suboptimal email output
  • Users tended to provide either too little or too much context, which degraded email relevance and AI performance

Overview of the New Knowledge Base 05:12

  • Shifted from a "push" to a "pull" model: Alice now autonomously gathers relevant seller context
  • Knowledge base acts as a centralized repository for seller information (marketing materials, case studies, sales calls, press releases, etc.)
  • Users upload various types of resources (documents, images, websites, audio, video) which are parsed, stored, and referenced for email generation

Knowledge Base Architecture & Pipeline 06:36

  • User uploads resources, stored in S3, then processed asynchronously by selected vendors
  • Parsed data is stored in the database and upserted into Pinecone (vector DB) as embeddings
  • Alice queries the vector database during message generation
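
A minimal sketch of how this asynchronous flow could be wired, assuming a FastAPI service; the bucket name and the three helper functions (vendor hand-off, database write, downstream chunking) are hypothetical stand-ins rather than names from the talk:

```python
# Hedged sketch of the upload -> parse -> webhook flow, assuming a FastAPI
# service. The bucket name and the three helper functions are hypothetical
# stand-ins for the vendor SDKs, the application database, and the
# downstream chunk/embed/upsert stages.
import boto3
from fastapi import FastAPI, UploadFile

app = FastAPI()
s3 = boto3.client("s3")
BUCKET = "knowledge-base-uploads"  # assumed bucket name


def submit_to_parsing_vendor(bucket: str, key: str, callback_url: str) -> None:
    """Placeholder: hand the S3 object to LlamaParse / Firecrawl / Cloudglue."""
    ...


def store_parsed_markdown(resource_id: str, markdown: str) -> None:
    """Placeholder: persist the parsed markdown in the application database."""
    ...


def enqueue_chunking(resource_id: str) -> None:
    """Placeholder: queue the chunk -> embed -> upsert-to-Pinecone steps."""
    ...


@app.post("/resources")
async def upload_resource(file: UploadFile):
    """Store the raw upload in S3 and kick off asynchronous parsing."""
    s3.put_object(Bucket=BUCKET, Key=file.filename, Body=await file.read())
    submit_to_parsing_vendor(BUCKET, file.filename, callback_url="/webhooks/parsed")
    return {"status": "processing", "key": file.filename}


@app.post("/webhooks/parsed")
async def on_parsed(payload: dict):
    """Vendor callback: persist the markdown, then continue the pipeline."""
    store_parsed_markdown(payload["key"], payload["markdown"])
    enqueue_chunking(payload["key"])
    return {"ok": True}
```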

Pipeline Deep Dive 07:32

  • Five-step pipeline: Parsing, Chunking, Storage, Retrieval, Visualization

Parsing

  • Converts non-text resources (PDFs, videos, images) to text (markdown), making them LLM-legible
  • Vendors chosen for parsing: LlamaParse for documents and images, Firecrawl for websites, Cloudglue for audio and video (a hedged example follows this list)
  • Selection criteria prioritized coverage of the needed resource types, markdown output, and webhook support; accuracy, comprehensiveness, and cost were initially deprioritized
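
For the document and image path, a hedged example of what parsing could look like with the LlamaParse Python client; the file name and API-key handling are assumptions, and websites and audio/video would be routed to Firecrawl and Cloudglue instead:

```python
# Parsing sketch for the document/image path using LlamaParse. The file name
# and environment variable are assumptions; websites and audio/video go
# through Firecrawl and Cloudglue rather than this client.
import os

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key=os.environ["LLAMA_CLOUD_API_KEY"],
    result_type="markdown",  # markdown keeps headings, which the chunker relies on
)

# load_data returns one or more documents, each carrying parsed text.
documents = parser.load_data("./case_study.pdf")
markdown_blob = "\n\n".join(doc.text for doc in documents)
print(markdown_blob[:500])
```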

Chunking

  • Breaks down markdown blobs into semantic entities for embedding and retrieval
  • Uses a combination of splitting by markdown headers, sentences, and tokens to preserve logical structure and prevent overly long chunks
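
A sketch of that header-then-token strategy, using the LangChain text splitters as stand-ins for whatever 11x actually runs; the header mapping and the 512-token budget are assumptions, and sentence boundaries are handled implicitly by the recursive splitter's separators:

```python
# Chunking sketch: split on markdown headers first to keep semantic sections
# together, then cap each section by token count so no chunk is too long to
# embed. Header mapping and the 512-token budget are illustrative choices.
from langchain_text_splitters import (
    MarkdownHeaderTextSplitter,
    RecursiveCharacterTextSplitter,
)

header_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "h1"), ("##", "h2"), ("###", "h3")]
)
token_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base", chunk_size=512, chunk_overlap=50
)

markdown_blob = "# Acme Corp\n\n## Case study\nAcme cut onboarding time by 40%..."
sections = header_splitter.split_text(markdown_blob)  # semantic sections
chunks = token_splitter.split_documents(sections)     # enforce a max length

for chunk in chunks:
    print(chunk.metadata, chunk.page_content[:80])
```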

Storage

  • Chose Pinecone for the vector database due to its ease of use, bundled embedding models, and strong customer support
  • Explored other storage options before settling on vector DB for efficient similarity search
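
A hedged sketch of the upsert-and-query flow against a serverless Pinecone index; the index name, dimension, per-seller namespace, and the placeholder embedding function are assumptions, and using Pinecone's bundled embedding models would remove the explicit embedding step entirely:

```python
# Storage sketch: create a serverless Pinecone index, upsert chunk vectors
# with metadata, and run a similarity query. Index name, dimension, namespace
# scheme, and the fake embed() helper are illustrative assumptions.
import random

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")
INDEX_NAME = "seller-knowledge-base"

if INDEX_NAME not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX_NAME,
        dimension=1024,  # must match the embedding model's output size
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(INDEX_NAME)


def embed(text: str) -> list[float]:
    """Placeholder embedding; swap in a real model (or Pinecone's bundled ones)."""
    rng = random.Random(text)
    return [rng.uniform(-1.0, 1.0) for _ in range(1024)]


index.upsert(
    vectors=[{
        "id": "case-study-001-chunk-3",
        "values": embed("Acme cut onboarding time by 40% using..."),
        "metadata": {"seller_id": "acme", "source": "case_study.pdf"},
    }],
    namespace="acme",  # one namespace per seller keeps customers isolated
)

results = index.query(
    vector=embed("What results has Acme seen with onboarding?"),
    top_k=5,
    include_metadata=True,
    namespace="acme",
)
for match in results.matches:
    print(match.id, match.score, match.metadata)
```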

Retrieval

  • Adopted evolving RAG (retrieval-augmented generation) practices, moving from traditional to agentic to deep research RAG
  • Uses Letta (a cloud agent provider) to run a deep research agent that plans its queries, retrieves the necessary context, and synthesizes responses for Alice; a generic version of that loop is sketched below
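
A generic illustration of the plan / retrieve / synthesize loop behind deep research RAG; the chat and vector_search helpers are hypothetical, and in production this loop is delegated to a Letta agent rather than hand-rolled:

```python
# Generic deep-research RAG loop: plan queries, retrieve context, synthesize.
# chat() and vector_search() are hypothetical placeholders for an LLM call and
# the Pinecone query shown in the storage step; the real system uses Letta.
import json


def chat(prompt: str) -> str:
    """Placeholder for a chat-completion call to any LLM provider."""
    ...


def vector_search(query: str, top_k: int = 5) -> list[str]:
    """Placeholder: embed the query, hit the vector index, return chunk texts."""
    ...


def deep_research(question: str, max_rounds: int = 3) -> str:
    notes: list[str] = []
    for _ in range(max_rounds):
        # 1. Plan: ask the model what it still needs to look up.
        plan = json.loads(chat(
            "You are researching a seller's knowledge base.\n"
            f"Question: {question}\nNotes so far: {json.dumps(notes)}\n"
            'Reply with JSON: {"done": bool, "queries": [".."]}'
        ))
        if plan["done"]:
            break
        # 2. Retrieve: run each planned query against the vector store.
        for query in plan["queries"]:
            notes.extend(vector_search(query))
    # 3. Synthesize: turn the collected context into an answer for Alice.
    return chat(f"Answer '{question}' using only these notes:\n{json.dumps(notes)}")
```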

Visualization

  • Introduced interactive 3D visualization of the knowledge base, showing how context is stored and retrieved
  • Allows users to examine the AI's "brain" by clicking on individual vectors to view the associated content, increasing transparency and trust
  • Integrated with UI to let users upload resources, query Alice, and inspect campaign knowledge in Q&A form
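
A rough sketch of how such a view can be produced: project the stored embeddings down to three dimensions and render an interactive scatter plot. Random vectors stand in for data fetched from Pinecone, and the real feature is a custom in-app 3D view rather than Plotly:

```python
# Visualization sketch: reduce embeddings to 3D with PCA and plot them so each
# point (chunk) can be inspected interactively. Random vectors stand in for
# the embeddings and chunk texts that would be fetched from the vector DB.
import numpy as np
import plotly.express as px
from sklearn.decomposition import PCA

embeddings = np.random.rand(200, 1024)                  # stand-in vectors
labels = [f"chunk {i}: case study excerpt..." for i in range(200)]

coords = PCA(n_components=3).fit_transform(embeddings)  # 1024-D -> 3-D

fig = px.scatter_3d(
    x=coords[:, 0], y=coords[:, 1], z=coords[:, 2],
    hover_name=labels,  # hovering reveals the chunk behind each point
    title="Knowledge base embeddings (3D projection)",
)
fig.show()
```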

Lessons Learned & Future Plans 20:25

  • RAG implementation is more complex than expected, with many micro-decisions and technical challenges
  • Recommend delivering a working production version before benchmarking and optimizing
  • Leverage vendor expertise and support during development
  • Upcoming focus areas: tracking hallucinations in emails, evaluating parsing vendors for accuracy/completeness, experimenting with hybrid RAG (graph + vector DB), and cost reduction across the pipeline

Conclusion & Call to Action 22:03

  • The knowledge base significantly improved Alice's functionality and user experience
  • 11X is hiring and encourages interested individuals to reach out