The GPT-5 launch presentation by OpenAI was perceived as overly staged and awkward, failing to capture the natural style of previous live streams.
Minor details like Sam Altman’s footwear and presentation mishaps (e.g., benchmark slides with numerical inconsistencies) were highlighted and criticized by the community.
Critics noted that the slides would have benefited from being checked by OpenAI’s own models.
GPT-5 is better understood as a system than as a single model: a model router sends each prompt to the best-fit sub-model based on context and complexity.
Within GPT-5, "reasoning" and "non-reasoning" models are kept distinct, so that speed and cost can be balanced for different query types.
The router’s main purpose is to save costs, which matters especially given OpenAI’s large user base (around 700 million users).
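The source does not describe how the router actually decides, so the following is only a minimal sketch of the general idea, assuming a crude keyword-and-length heuristic. The names `Tier`, `FAST`, `REASONING`, `REASONING_HINTS`, `estimate_complexity`, `route`, and `average_cost`, as well as the cost units and the threshold, are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A hypothetical sub-model tier; relative_cost is in made-up units."""
    name: str
    relative_cost: float


# Illustrative tiers only; these are not real model names or prices.
FAST = Tier("fast-non-reasoning", 1.0)
REASONING = Tier("slow-reasoning", 10.0)

# Assumed keyword hints; a production router would more likely be learned.
REASONING_HINTS = ("prove", "debug", "step by step", "derive", "optimize")


def estimate_complexity(prompt: str) -> float:
    """Return a crude complexity score in [0, 1] from keywords and length."""
    text = prompt.lower()
    hint_score = sum(hint in text for hint in REASONING_HINTS) / len(REASONING_HINTS)
    length_score = min(len(text.split()) / 200, 1.0)
    return max(hint_score, length_score)


def route(prompt: str, threshold: float = 0.2) -> Tier:
    """Send complex prompts to the reasoning tier, everything else to the fast tier."""
    return REASONING if estimate_complexity(prompt) >= threshold else FAST


def average_cost(prompts: list[str]) -> float:
    """Average relative cost per prompt after routing."""
    return sum(route(p).relative_cost for p in prompts) / len(prompts)
```

In this toy setup, the more prompts the router can keep on the fast tier, the closer the average cost stays to the cheap tier's cost, which is the cost-saving argument in miniature.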
GPT-5 can perform agentic operations such as testing its own code outputs and feeding the results back into its answer. This suggests an internal use of tools (an agentic loop) to improve results, which is especially useful for coding and math tasks.
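A generate, test, and revise loop of this kind can be sketched as follows. This is an illustration under stated assumptions, not a description of GPT-5's internals: `stub_model` is a hypothetical stand-in for a model call (its first draft is deliberately buggy), and `run_tests` and `agentic_loop` are names invented for this example.

```python
from typing import Callable, Optional, Tuple


def run_tests(source: str) -> Optional[str]:
    """Execute candidate code; return an error message, or None if the test passes."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # acceptable for a toy example, unsafe for untrusted code
        assert namespace["add"](2, 3) == 5, "add(2, 3) should equal 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def stub_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical model call: buggy first draft, corrected once it sees feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def agentic_loop(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Generate code, test it, and feed failures back until the test passes."""
    feedback: Optional[str] = None
    for round_number in range(1, max_rounds + 1):
        candidate = generate(task, feedback)
        feedback = run_tests(candidate)
        if feedback is None:
            return candidate, round_number
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds: {feedback}")


if __name__ == "__main__":
    code, rounds = agentic_loop("write add(a, b)", stub_model)
    print(f"passed after {rounds} round(s)")
```

The key design point is that the test result, not the model's own confidence, decides when the loop stops.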
The benchmarks used by OpenAI draw skepticism: some are saturated, and some results are selectively reported (e.g., by omitting difficult test instances).
GPT-5’s benchmark performance is good but not leading; it lags behind some competitors (such as Grok 4) in areas like the ARC challenge.
The critique also recalls previous incidents in which OpenAI’s reported results were later clarified because of contamination in the pre-training data.
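To see why omitting test instances matters, the arithmetic can be sketched with made-up numbers. The functions `pass_rate` and `full_set_bounds` and the figures in the usage lines are hypothetical and do not correspond to any result reported by OpenAI.

```python
from typing import Tuple


def pass_rate(passed: int, attempted: int) -> float:
    """Pass rate on the instances that were actually run."""
    if attempted <= 0 or not 0 <= passed <= attempted:
        raise ValueError("need 0 <= passed <= attempted and attempted > 0")
    return passed / attempted


def full_set_bounds(passed: int, attempted: int, total: int) -> Tuple[float, float]:
    """Bounds on the full-benchmark pass rate when some instances were omitted.

    Lower bound: every omitted instance would have failed.
    Upper bound: every omitted instance would have passed.
    """
    omitted = total - attempted
    if omitted < 0 or not 0 <= passed <= attempted:
        raise ValueError("need 0 <= passed <= attempted <= total")
    return passed / total, (passed + omitted) / total


if __name__ == "__main__":
    # Hypothetical: 60-instance benchmark, 50 instances run, 45 solved.
    print(pass_rate(45, 50))
    print(full_set_bounds(45, 50, 60))
```

If the omitted instances are the hard ones, the true full-set score sits near the lower bound, well under the headline figure computed on the reduced set.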
The system consists of several variants: the main GPT-5 system (possibly with two or more internal models), GPT-5 Mini, GPT-5 Nano, and others.
The models are notably faster and cheaper, possibly because of lower-precision training (e.g., FP4) and more efficient compute usage, although this is speculative.
Faster, cheaper models are seen as a significant advantage, especially for tasks like coding and running agentic tools.
GPT-5 does not support audio inputs or real-time capabilities at launch, though these may come in future versions.
The Mini and Nano variants offer even lower prices and similar context windows, but they have earlier knowledge cutoffs and likewise lack audio and real-time support.
The pricing structure undercuts most of the competition (e.g., Claude Opus and Sonnet, Gemini 2.5 Pro) and is particularly aggressive for the Mini and Nano variants.