The Grok 4 livestream was delayed by over an hour; the host emphasizes the significance of the release and the anticipation surrounding it
The host plans to make multiple future videos testing Grok 4
Grok 4 is claimed to achieve perfect SAT scores and near-perfect results on the GRE and graduate-level exams across diverse disciplines, even on previously unseen questions
Grok 4 is tested on the "Humanity's Last Exam" (HLE), a set of 2,500 challenging, PhD-level problems across mathematics, natural sciences, engineering, and humanities
Prior models scored only single-digit accuracy on HLE; Grok 4 scores much higher
Grok 4 is said to perform at PhD level in every academic subject, although it has not yet discovered new physics or invented new technology, a milestone predicted to be reached within one or two years
Adding native tool use (such as web search and memory) during Grok 4's training significantly improved its ability to use tools
The current tools are considered primitive compared to the advanced physical simulation tools used at companies like Tesla or SpaceX, which are to be integrated later in the year
Future plans involve giving Grok accurate physics simulators and the ability to interact with the real world via humanoid robots (like Optimus)
Emphasis on entering an "intelligence explosion" era
Grok 4 and Grok 4 Heavy are shown solving academic and real-world tasks (e.g., math problems, World Series prediction, searching through X posts, identifying weird photos)
Grok has a unique advantage in real-time data from X (formerly Twitter), which provides rich, up-to-date information not accessible to competitors
The model demonstrates the ability to generate visualizations and simulations (e.g., black hole collisions) using accessible resources, though it is limited by browser-based computation
Introduction of new, high-quality voice modes ("Sal," "Eve") with improved latency and naturalness, sometimes demonstrated through creative tasks (e.g., singing operas about Diet Coke)
Voice interactions aim for calm and smooth conversational styles, competing with OpenAI's advanced voice mode
Voice model latency has been halved and the user base has increased tenfold since launch