Everything you need to know about GPT-5 (+ mini and nano) Introduction & Sponsor 00:00
GPT-5 has been released, accompanied by additional information about its benchmarks, pricing, model options, supported tools, and implications for developers.
The presenter references a prior video where they had early access to the model and have now transitioned to paying for usage.
Daytona is featured as the sponsor, offering inexpensive, stateful AI infrastructure with robust SDKs for deploying and managing AI agents.
Livestream and First Impressions 02:30
The official GPT-5 launch livestream was criticized for poor presentation and confusing data visuals.
Despite underwhelming presentation, the model itself is a significant leap, not just a minor improvement.
Pricing & Model Cost Comparison 03:40
GPT-5 pricing is $1.25 per million input tokens and $10 per million output tokens.
Previous models like GPT-3.5 and GPT-4 had higher rates: GPT-3.5 was $10 in/$40 out, GPT-4 was $15 in/$60 out.
Claude Opus and Gemini models are priced significantly higher than GPT-5.
Real-world costs can differ from token prices, as token generation efficiency varies by model.
Benchmark tests show GPT-5 is much cheaper per run compared to Grok 4 and offers better efficiency.
Other Model Options: Mini and Nano 05:46
GPT-5 Mini is priced at $0.25 per million input tokens and $2 per million output tokens, cheaper and smarter than Gemini 2.5 Flash.
GPT-5 Nano is $0.05 per million in and $0.40 per million out, making it attractive for cost-sensitive tasks.
Token caching offers a 90% input cost discount for reused tokens.
Bulk pricing options are available, reducing large-scale operation costs further.
Compared to Gemini 2.5 Pro and Flash, GPT-5 offerings are more competitive, especially at larger context sizes.
Context Length & Cutoff Date 07:11
GPT-5 supports a 400,000-token context window for input and can output up to 128,000 tokens per request.
The official data cutoff is September 30, 2023.
Access, User Experience, & Features 07:49
GPT-5 is available on T3 Chat under the standard tier, with Mini and Nano available in the free tier and the reasoning (thinking) model in premium.
New interface features, like an option to skip the longer "thinking" phase for a quicker answer, are being introduced on chatgpt.com.
Unified System & Model Routing 09:47
GPT-5 uses a unified system that routes queries to different models (smart/fast, reasoning, etc.) based on complexity, tool use, and prompt intent.
This dynamic routing is based on real-time data, including user preferences and actions.
The system resembles mixture-of-experts designs but operates with higher-level model routing.
The presenter notes OpenAI's leadership in this routing and recommendation system innovation.
Benchmarks & Real-World Performance 11:38
On Skatebench and other independent benchmarks, GPT-5 and GPT-5 Mini scored exceptionally well, outperforming previous mini models.
GPT-5 combines high performance with significantly lower costs compared to Grok 4, GPT-3.5, and GPT-4.
The model generates fewer tokens on average, contributing to cost efficiency.
Benchmarks confirm superior instruction-following abilities.
Model Tiers & Replacements for Older Models 12:38
OpenAI provided guidance for which GPT-5 variants should replace existing models (e.g., GPT-4, 3.5, etc.) for different use cases.
GPT-5 Main replaces GPT-4, while GPT-5 Mini and Nano replace earlier mini and flash models.
The trainer describes improved data filtering and safety in the new models, with less regurgitation of content and more summarization and abstraction.
Safety, Alignment, and Refusals 14:41
GPT-5 is trained with advanced safety techniques focused on safe completions instead of binary refusals, especially for dual-use prompts (e.g., biology, cybersecurity).
Agentic alignment tests (e.g., blackmail avoidance, lethal intent avoidance) show GPT-5 scores zero harmful actions, outperforming previous models.
Disallowed content and sycophancy (overly agreeable or enabling responses) have been significantly reduced, addressing past controversies with GPT-4.
New instruction hierarchy ensures that system, developer, and user prompts are prioritized for safety and reliability.
Hallucinations, Deception, & Reliability 19:45
GPT-5 significantly reduces hallucinated/incorrect information: "thinking" mode is under 5% major incorrect claims versus over 20% in GPT-3 and GPT-4.
Deception rates are also much lower: 10x lower in missing image tests, 6x lower in broken tool testing, and 2-3x lower in code deception.
The model is also improved in health advice accuracy and support for other languages.
Real-World Hacker & Security Community Feedback 21:16
Security professionals are impressed with GPT-5's capabilities, especially in reverse engineering and finding obscure information.
Model performance in real-world tough questions shows significant progress over previous AIs.
Long Context Reasoning and Benchmarks 22:02
GPT-5 occupies top positions in long-context reasoning benchmarks and can control token usage/cost by choosing minimal, low, medium, or high-effort variants.
The model scales effort and intelligence according to the needs and cost constraints of the user.
Coding, UI Generation, & SVG Tasks 24:40
GPT-5's performance in generating UIs from screenshots is still developing; results are functional but not accurately representative.
For SVG generation tasks (like "Pelican riding a bike" benchmarks), GPT-5, Mini, and Nano produce competent results, surpassing many other models.
Final Impressions & Recommendations 25:55
GPT-5 is described as a groundbreaking model that is cost-effective, highly competent, and reliable for a range of tasks.
The presenter highlights they no longer feel the need to constantly switch models, as GPT-5 consistently delivers strong results.
Public access is available through platforms like T3 Chat, ChatGPT, and Cursor, with limited free access during launch week.