SUMMARY

The GPT-5 launch presentation by OpenAI was perceived as overly staged and awkward, failing to capture the natural style of previous live streams.
Minor details like Sam Altman’s footwear and presentation mishaps (e.g., benchmark slides with numerical inconsistencies) were highlighted and criticized by the community.
Critics added that OpenAI could have caught the slide errors by having its own models check them.

GPT-5 represents more of a system or ecosystem, not a single model; it uses a model router to allocate prompts to the best-fit sub-model based on context and complexity.
There is a distinction between "reasoning" and "non-reasoning" models within GPT-5, optimizing both speed and cost for different query types.
The router’s main purpose is to save costs, especially important for OpenAI’s large user base (around 700 million users).
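The routing idea above can be sketched in a few lines. This is only an illustration under stated assumptions: OpenAI has not published how its router decides, so the cue words, the length threshold, and the labels "reasoning" and "non-reasoning" used as return values are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cue words; the signals the real router uses are not public.
REASONING_HINTS = ("prove", "debug", "step by step", "think hard", "derive")


@dataclass(frozen=True)
class Route:
    model: str   # illustrative label, not a real API model name
    reason: str  # why this route was chosen, useful for logging


def route_prompt(prompt: str, long_prompt_chars: int = 2000) -> Route:
    """Send a prompt to the cheap path unless it looks complex."""
    text = prompt.lower()
    for hint in REASONING_HINTS:
        if hint in text:
            return Route("reasoning", f"matched hint {hint!r}")
    if len(prompt) > long_prompt_chars:
        return Route("reasoning", "long prompt")
    return Route("non-reasoning", "default cheap path")


if __name__ == "__main__":
    print(route_prompt("What is the capital of France?"))
    print(route_prompt("Prove that the square root of 2 is irrational."))
```

Defaulting to the cheap path reflects the cost-saving purpose described above: only prompts that show some sign of complexity pay for the slower, more expensive model.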

GPT-5 can perform agentic operations such as testing its own code output and feeding the results back in, which suggests it uses tools internally (an agentic loop) to improve results; this is especially useful for coding and math tasks.
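A generate, test, and feed back loop of this kind might look like the sketch below. It is not OpenAI's implementation, which is not public. The function `agentic_loop`, the stand-in `fake_model`, and the checker `check_add` are hypothetical names introduced only for illustration, and a real system would run candidate code in a sandbox rather than with `exec`.

```python
from typing import Callable, Optional


def agentic_loop(
    generate: Callable[[Optional[str]], str],
    check: Callable[[dict], None],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for code, run it, and feed any failure back to the generator."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        source = generate(feedback)
        namespace: dict = {}
        try:
            exec(source, namespace)  # unsafe outside a sandbox; sketch only
            check(namespace)         # raises (e.g. AssertionError) on failure
            return source            # candidate passed the check
        except Exception as exc:
            feedback = f"{type(exc).__name__}: {exc}"
    return None  # gave up after max_rounds attempts


def fake_model(feedback: Optional[str]) -> str:
    """Stand-in for a model: wrong on the first try, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def check_add(namespace: dict) -> None:
    assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"


if __name__ == "__main__":
    print(agentic_loop(fake_model, check_add))
```

The first candidate fails the check, the error message becomes the feedback for the second attempt, and the corrected version is returned.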

Creative writing, health advice, and code generation are emphasized as core improvement areas in this release.
OpenAI’s post-training workforce has expanded, with large teams specializing in specific verticals like health, code, and creative expression.
OpenAI is commended for improving health outputs despite the legal risk, given the worldwide demand for health advice and second opinions.

There is skepticism about the benchmarks used by OpenAI, noting some benchmarks are saturated and some results are selectively reported (e.g., omitting difficult test instances).
GPT-5’s benchmark performance is good but not leading; it lags behind some competitors (such as Grok 4) in areas like the ARC challenge.
Also mentioned are previous incidents in which OpenAI later clarified its reported results because of contamination in the pre-training data.

The system consists of several variants: main GPT-5 system (possibly with two+ internal models), GPT-5 Mini, GPT-5 Nano, and others.
The models are notably faster and cheaper, possibly because of lower-precision training (e.g., FP4) and more efficient compute usage, although this is speculation.
Faster, cheaper models are seen as a significant advantage, especially for tasks like coding and running agentic tools.

Main model pricing is $1.25 per million input tokens and $10 per million output tokens; costs are further optimized by routing most queries to the cheaper, non-reasoning model.
Context window supports up to 400,000 tokens in and 128,000 tokens out, enabling large-scale tasks like rewriting novels or reviewing large documents.
The knowledge cutoff is October of the previous year, indicating pre-training finished well in advance of release.
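To make the pricing concrete, here is a small cost calculation using only the figures quoted above. The function and constant names are illustrative, and the limits are taken as stated in this summary rather than from official documentation. A request that fills both limits would cost 400,000 × $1.25 / 1,000,000 + 128,000 × $10 / 1,000,000 = $0.50 + $1.28 = $1.78.

```python
# Figures as quoted in this summary; names are illustrative.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00
MAX_INPUT_TOKENS = 400_000
MAX_OUTPUT_TOKENS = 128_000


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request to the main model."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens outside the quoted context limit")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens outside the quoted output limit")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Largest request the quoted limits allow.
    print(f"${request_cost_usd(MAX_INPUT_TOKENS, MAX_OUTPUT_TOKENS):.2f}")
```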

GPT-5 does not support audio inputs or real-time capabilities at launch, though these may come in future versions.
Mini and Nano variants offer even lower prices and similar context windows but with earlier knowledge cutoffs and no audio or real-time support.
The pricing structure undercuts most competition (e.g., Claude Opus, Sonnet, Gemini 2.5 Pro) and is particularly aggressive with Mini and Nano variants.

GPT-5 is positioned to replace much of the GPT-4 family due to better performance and lower costs.
Some early users, such as the CEO of Cursor, called it the best coding model so far, though this may be influenced by pricing advantages.
There is user concern about the model router’s impact on reliability and consistency, especially for users wanting always-on "reasoning" mode.
The release is noted as less impressive than GPT-4’s debut, sparking debate on whether cost or raw performance is now more important for users.

GPT 5 - What They Didn't Say