Frank Liu from Voyage AI (now part of MongoDB) discusses AI-powered search and retrieval, focusing on embedding models and rerankers for RAG and semantic search.
Voyage AI's tools are also used for applications like classification and clustering.
The presentation covers a refresher on AI search, real-world applications with key lessons, and the future of the field.
AI-powered search finds related concepts even without identical wording, going beyond traditional methods like TF-IDF or BM25.
It understands user intent, allowing for more nuanced recommendations (e.g., suggesting "get well baskets" in response to a query about a sick friend).
It can perform some level of reasoning and instruction following.
Retrieval Augmented Generation (RAG) is a popular use case: incorporating AI search grounds the LLM's responses in retrieved documents and helps reduce hallucinations.
Embedding quality is a core component: by the speaker's estimate, 95% to 99% of systems use embeddings to convert unstructured data (text, PDFs, etc.) into a semantic space from which relevant documents are retrieved.
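As a rough illustration of retrieval in a semantic space, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and the names DOC_VECS, QUERY_VEC, and retrieve are hand-made assumptions for illustration only; a real system would obtain much higher-dimensional vectors from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy "embeddings"; a real model would produce these.
DOC_VECS = {
    "get_well_basket": [0.9, 0.1, 0.0],
    "birthday_cake": [0.1, 0.9, 0.1],
    "laptop_charger": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a query such as "my friend is sick".
QUERY_VEC = [0.8, 0.2, 0.1]
```

Calling retrieve(QUERY_VEC, DOC_VECS, top_k=1) ranks "get_well_basket" first even though the query shares no words with the document, which is the behavior keyword methods such as BM25 cannot provide on their own.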
Chatting with your Codebase: Applications like continue.dev demonstrate that there is no one-size-fits-all embedding model or LLM; extensive evaluation is crucial to find the best fit for specific applications (e.g., Voyage Code 3 for code-related tasks).
Structured Data Integration: Embeddings alone are often insufficient for powerful search systems; incorporating structured data (e.g., filters for legal documents by state or type) is essential for robust retrieval.
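One common way to combine the two is to apply structured filters first and rank only the surviving documents by vector similarity. The sketch below assumes a hypothetical Doc record with state and doc_type fields and toy two-dimensional vectors; none of these names come from the talk, and production systems usually push the filter into the vector index rather than scanning a list.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical schema: structured metadata plus a toy embedding.
    doc_id: str
    state: str
    doc_type: str
    vector: list


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filtered_search(query_vec, docs, top_k=3, **filters):
    """Keep docs whose metadata matches every filter, then rank by similarity."""
    candidates = [
        d for d in docs
        if all(getattr(d, field) == value for field, value in filters.items())
    ]
    candidates.sort(
        key=lambda d: cosine_similarity(query_vec, d.vector), reverse=True
    )
    return [d.doc_id for d in candidates[:top_k]]


DOCS = [
    Doc("ca_statute_1", "CA", "statute", [0.9, 0.1]),
    Doc("ca_case_1", "CA", "case_law", [0.8, 0.2]),
    Doc("ny_statute_1", "NY", "statute", [0.95, 0.05]),
    Doc("ca_statute_2", "CA", "statute", [0.1, 0.9]),
]
QUERY_VEC = [1.0, 0.0]
```

Without filters, the New York statute is the closest match to QUERY_VEC; with state="CA" and doc_type="statute" it is excluded no matter how similar its embedding is, which is the guarantee that embeddings alone cannot give.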
Agentic Retrieval and Feedback Loops: AI search systems are moving beyond simple input-output; they often involve feedback loops where LLMs expand or decompose queries (e.g., breaking down a Q4 earnings query into Q1-Q4).
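The decomposition step can be sketched as follows. Here decompose_query is a rule-based stand-in for the LLM call, and keyword_search is a simple word-overlap search used in place of embedding retrieval so the example stays self-contained; both functions, and the CORPUS contents, are assumptions made for illustration.

```python
def decompose_query(query, quarters=("Q1", "Q2", "Q3", "Q4")):
    """Stand-in for an LLM: expand one query into per-quarter sub-queries."""
    return [f"{query} {quarter}" for quarter in quarters]


def keyword_search(query, corpus, top_k=1):
    """Toy retriever: rank documents by number of shared lowercase words."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by doc_id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def agentic_retrieve(query, corpus, search=keyword_search, top_k=1):
    """Run every sub-query and merge results, dropping duplicates in order."""
    seen, merged = set(), []
    for sub_query in decompose_query(query):
        for doc_id in search(sub_query, corpus, top_k=top_k):
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


# Hypothetical corpus of quarterly reports plus one unrelated document.
CORPUS = {
    "report_q1": "acme earnings q1 revenue grew",
    "report_q2": "acme earnings q2 revenue flat",
    "report_q3": "acme earnings q3 revenue fell",
    "report_q4": "acme earnings q4 revenue record",
    "blog": "acme company picnic photos",
}
```

A single search for "acme earnings" with top_k=1 would return only one report, whereas agentic_retrieve("acme earnings", CORPUS) collects one report per quarter. In a real feedback loop the LLM would also inspect the merged results and decide whether to issue further queries.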
In the era of AI agents (2025-2026), search systems will need to handle conversational data well.
The speaker expects the future of AI search to be fully multimodal, with models that understand and embed combinations of images, text, and audio into a single semantic space.
Instruction tuning and reasoning will play a huge role, allowing users to steer vector retrieval with specific instructions beyond just queries (e.g., "find documents that only dive into detail about this particular aspect").
The concept of an "agent-native database" is emerging, aiming to consolidate multiple search and retrieval components (embedding, reranking, query augmentation and decomposition) into a single, unified data platform.