Parsing
Converts non-text resources (PDFs, videos, images) to text (markdown), making them LLM-legible
Vendors chosen for parsing: LlamaParse for documents/images, Firecrawl for websites, Cloudglue for audio/video
Selection criteria prioritized support for the needed resource types, markdown output, and webhooks; accuracy, comprehensiveness, and cost were initially deprioritized
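Webhooks matter here because parsing a large PDF or video is asynchronous: the vendor accepts the job, works on it, and calls back when markdown is ready. A minimal sketch of such a callback handler; the payload fields and handler name are hypothetical, not any vendor's actual schema:

```python
# Hypothetical webhook handler: a parsing vendor POSTs a payload when an
# async parse job finishes, and we file the resulting markdown for chunking.
# The payload shape below is illustrative, not any vendor's real schema.

def handle_parse_webhook(payload: dict, store: dict) -> str:
    """Record a finished parse job's markdown in our document store."""
    job_id = payload["job_id"]
    status = payload["status"]
    if status != "completed":
        # Failed or still-running jobs are recorded for retry/monitoring.
        store[job_id] = {"status": status, "markdown": None}
        return status
    store[job_id] = {
        "status": "completed",
        "resource_type": payload.get("resource_type", "document"),
        "markdown": payload["markdown"],  # text ready for the chunking stage
    }
    return "completed"
```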
Chunking
Breaks down markdown blobs into semantic entities for embedding and retrieval
Uses a combination of splitting by markdown headers, sentences, and tokens to preserve logical structure and prevent overly long chunks
Storage
Chose Pinecone for the vector database due to its ease of use, bundled embedding models, and strong customer support
Explored other storage options before settling on a vector DB for efficient similarity search
Retrieval
Adopted evolving RAG (retrieval-augmented generation) practices, moving from traditional to agentic to deep research RAG
Uses Letta (cloud agent provider) to create a deep research agent that plans, retrieves necessary context, and synthesizes responses for Alice
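The plan, retrieve, synthesize loop can be sketched as below. The planner, retriever, and synthesizer here are stubs; in the real system the agent is hosted on the cloud agent platform, an LLM fills each role, and retrieval hits the vector DB rather than a keyword match.

```python
# Hedged sketch of a deep-research RAG loop: decompose the question,
# gather context per sub-question, then synthesize one answer.
# All three stages are stubs standing in for LLM calls and vector search.

def plan(question: str) -> list[str]:
    # Stub planner: a real agent asks an LLM to decompose the question.
    return [f"background on: {question}", f"specifics of: {question}"]

def retrieve(subquery: str, store: dict) -> list[str]:
    # Stub retriever: keyword match stands in for vector similarity search.
    return [text for key, text in store.items() if key in subquery]

def synthesize(question: str, evidence: list[str]) -> str:
    # Stub synthesizer: a real agent prompts an LLM with the evidence.
    return f"Answer to '{question}' based on {len(evidence)} retrieved chunks."

def deep_research(question: str, store: dict) -> str:
    evidence: list[str] = []
    for subquery in plan(question):                 # 1) plan sub-questions
        evidence.extend(retrieve(subquery, store))  # 2) gather context
    return synthesize(question, evidence)           # 3) synthesize a response
```

The difference from traditional RAG is the planning step: instead of one embed-and-retrieve pass over the raw question, the agent decides what it needs to look up before answering.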
Visualization
Introduced interactive 3D visualization of the knowledge base, showing how context is stored and retrieved
Allows users to examine the AI's "brain", clicking on vectors to view associated content, increasing transparency and trust
Integrated with UI to let users upload resources, query Alice, and inspect campaign knowledge in Q&A form
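To render high-dimensional embeddings as clickable points in 3D, they must first be projected down to three coordinates. The source doesn't say which projection the team used; PCA via SVD is one common choice, sketched here with NumPy:

```python
import numpy as np

def project_to_3d(embeddings: np.ndarray) -> np.ndarray:
    """Project (n_chunks, dim) embedding vectors to (n_chunks, 3) via PCA,
    so each chunk can be drawn as a point in the 3D knowledge-base view.
    PCA is an assumption here; t-SNE or UMAP are common alternatives."""
    centered = embeddings - embeddings.mean(axis=0)
    # Top 3 principal directions from the SVD of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T
```

Each projected point keeps its record id, so a click in the UI can look up the chunk's text and metadata in the vector store.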
Takeaways
RAG implementation is more complex than expected, with many micro-decisions and technical challenges
Recommend delivering a working production version before benchmarking and optimizing
Leverage vendor expertise and support during development
Upcoming focus areas: tracking hallucinations in emails, evaluating parsing vendors for accuracy/completeness, experimenting with hybrid RAG (graph + vector DB), and cost reduction across the pipeline