SUMM

The talk begins with an overview of search technology, starting with Google's launch in 1998 and its innovation of PageRank to rank results by authority.
The presenter notes that in 2021, large language models like GPT-3 made it possible to interpret queries with a much deeper understanding of user intent, exposing a gap between older search technology and new AI capabilities.
According to the speaker, Google’s search algorithm still relies heavily on keyword matching, which handles complex user queries poorly.

The speaker describes dedicating years to developing a search engine that truly understands user queries and web content at a deep, semantic level.
Traditional search engines index web pages based on keywords, while the new approach uses transformer models to create embeddings that capture document meaning, ideas, and references.
Embeddings allow for more nuanced, meaningful search results, especially for complex and long-form queries.
The company Exa built a new type of search engine leveraging these neural embeddings, enabling it to handle complex, multi-paragraph queries and provide more relevant results than traditional search.
The launch generated excitement as users could now make advanced queries that previous search engines couldn't address.
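The contrast between keyword indexing and embedding-based retrieval can be sketched in a few lines. This is a toy illustration only, not Exa's implementation: the three-dimensional vectors are hand-written stand-ins for what a transformer model would produce, and the document names are hypothetical.

```python
import math

def keyword_score(query: str, doc: str) -> int:
    """Count distinct query words that literally appear in the document."""
    doc_words = set(doc.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in doc_words)

def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_by_embedding(query_vec, doc_vecs):
    """Return document ids sorted by descending similarity to the query."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)

# Hypothetical, hand-written embeddings (a real model would produce
# vectors with hundreds or thousands of dimensions).
DOC_VECS = {
    "startup_funding_guide": [0.9, 0.1, 0.0],
    "gardening_tips": [0.0, 0.2, 0.9],
}
QUERY_VEC = [0.8, 0.3, 0.1]  # stand-in for "raise money for my company"

if __name__ == "__main__":
    # No literal word overlap, so keyword matching finds nothing.
    print(keyword_score("raise money for my company",
                        "How founders secure venture capital"))
    # The embedding ranking still surfaces the related document first.
    print(rank_by_embedding(QUERY_VEC, DOC_VECS))
```

The point of the sketch is that the keyword scorer depends on shared surface words, while the embedding ranking depends only on how close the vectors are, so a paraphrased query can still reach a relevant document.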

The emergence of ChatGPT and other LLMs changed how people engage with online information, raising questions about the relevance of search engines.
The speaker argues that LLMs, despite their size, cannot store the entire web or keep up with its constant updates; for example, GPT-4’s parameters can represent only a small fraction of the web’s data.
Web search therefore remains necessary, since LLMs lack real-time, comprehensive coverage of the web.

Traditional search engines were designed for humans, optimizing for simple keyword queries, clickable links, and a limited interface.
AI agents have very different requirements—they can process complex, expansive queries, seek large quantities of data, and are not limited by UI or human attention span.
Traditional search tools and API endpoints are mismatched with AI needs, which prompted Exa to design a search engine specifically to serve AI agents.

AI agents need precise and controllable information, demanding exact responses to their queries instead of results optimized for human clicks.
They can search using extensive context and multi-paragraph queries, which traditional search engines are not designed to handle.
AI agents seek comprehensive datasets—possibly thousands of results—to analyze or summarize, a capability beyond traditional search, which typically returns only the most clickable links.
The space of possible queries expands dramatically in the AI era, since AI agents can request highly specific, filtered, or structured information that was previously impractical to ask for.

Exa aims to provide a single API capable of handling keyword, semantic, and complex queries, thereby serving all information needs of AI agents.
The system is designed for full flexibility and control, allowing users or AI agents to set parameters like result count, date ranges, domains, and search type (neural or keyword).
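To make the listed controls concrete, the sketch below models a search request with those four parameters and validates it before it would be sent. The class and field names (`SearchRequest`, `num_results`, `start_date`, `include_domains`, and so on) are hypothetical and chosen for illustration; they are not taken from the talk, and the actual Exa API may name or structure them differently.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class SearchRequest:
    """Hypothetical request shape; field names are illustrative assumptions."""
    query: str
    search_type: str = "neural"          # "neural" or "keyword"
    num_results: int = 10
    start_date: Optional[date] = None    # earliest publication date
    end_date: Optional[date] = None      # latest publication date
    include_domains: List[str] = field(default_factory=list)

    def to_payload(self) -> dict:
        """Validate the request and return a JSON-serializable dict."""
        if self.search_type not in ("neural", "keyword"):
            raise ValueError(f"unknown search type: {self.search_type!r}")
        if self.num_results < 1:
            raise ValueError("num_results must be at least 1")
        if self.start_date and self.end_date and self.start_date > self.end_date:
            raise ValueError("start_date must not be after end_date")

        payload = {
            "query": self.query,
            "type": self.search_type,
            "num_results": self.num_results,
        }
        # Optional filters are included only when the caller set them.
        if self.start_date:
            payload["start_date"] = self.start_date.isoformat()
        if self.end_date:
            payload["end_date"] = self.end_date.isoformat()
        if self.include_domains:
            payload["include_domains"] = list(self.include_domains)
        return payload
```

An agent could then build a request such as `SearchRequest("information retrieval engineers", search_type="keyword", num_results=100, include_domains=["github.com"]).to_payload()` and pass the resulting dictionary to whatever HTTP client it uses.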

A brief tour of Exa’s search dashboard shows various API filters and search options, highlighting its customizability for AI-driven applications.
The speaker demonstrates building an agent ("Mark") that converts query results into markdown and illustrates the use of neural and keyword search methods.
An example shows combining neural and keyword searches to find GitHub profiles of engineers in San Francisco interested in information retrieval.
Exa’s API is showcased as capable of scaling up to deliver hundreds or thousands of results for enterprise needs.
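The summary does not say how the neural and keyword results are combined in the demo. One common way to merge two ranked lists is reciprocal rank fusion (RRF), shown here as a stand-in technique rather than as Exa's method. The profile URLs are made up, and `k=60` is a conventional default for RRF, not a value from the talk.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

def reciprocal_rank_fusion(rankings: Sequence[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists into one.

    Each item earns 1 / (k + rank) from every list it appears in, so items
    ranked highly by more than one search method rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))

if __name__ == "__main__":
    # Hypothetical result lists from the two search modes.
    neural_hits = ["github.com/alice", "github.com/bob", "github.com/carol"]
    keyword_hits = ["github.com/bob", "github.com/dave", "github.com/alice"]
    for url in reciprocal_rank_fusion([neural_hits, keyword_hits]):
        print(url)
```

Profiles returned by both searches accumulate score from each list, which is why they outrank profiles found by only one method.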

Exa has recently launched a research endpoint that orchestrates multiple searches and LLM calls to assemble comprehensive reports or structured outputs on demand.
This is presented as a state-of-the-art tool for deep research tasks powered by AI and neural search integration.
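The orchestration pattern described here (several searches plus LLM calls feeding one report) can be sketched as a small pipeline. This is an assumption about the general shape, not a description of Exa's research endpoint: `plan`, `search`, and `summarize` are hypothetical callables that a caller would back with a real LLM and a real search API.

```python
from typing import Callable, List

def run_research(
    question: str,
    plan: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
    summarize: Callable[[str, List[str]], str],
) -> str:
    """Break a question into sub-queries, gather evidence, and write a report.

    plan      -- e.g. an LLM call that proposes sub-queries (hypothetical)
    search    -- e.g. a search API call returning text snippets (hypothetical)
    summarize -- e.g. an LLM call that writes the final report (hypothetical)
    """
    evidence: List[str] = []
    seen = set()
    for sub_query in plan(question):
        for snippet in search(sub_query):
            # Skip snippets already collected by an earlier sub-query.
            if snippet not in seen:
                seen.add(snippet)
                evidence.append(snippet)
    return summarize(question, evidence)

if __name__ == "__main__":
    # Stub implementations so the sketch runs without network access.
    report = run_research(
        "neural search",
        plan=lambda q: [f"{q} history", f"{q} benchmarks"],
        search=lambda q: [f"snippet about {q}"],
        summarize=lambda q, ev: f"Report on {q} based on {len(ev)} snippets",
    )
    print(report)
```

Passing the three steps in as callables keeps the control flow testable with stubs and leaves the choice of model and search backend to the caller.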

Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai