I was wrong about GPT-5

Introduction & Setting Context 00:00

  • The creator explains there is significant confusion and frustration about the GPT-5 launch and clarifies that some misunderstandings are partly their fault.
  • Emphasizes that the public's experience with GPT-5 differs from the original testing experience.
  • Outlines intent to cover personal involvement with OpenAI, compensation, shifting experiences with the model, and issues since launch.

Personal Involvement and Compensation 00:54

  • Denies being paid by OpenAI; the only money offered was a $1,000 appearance fee, which he declined.
  • Discloses personal financial losses (~$25,000) from inference costs related to T3 Chat.
  • Details the process for gaining early access to GPT-5, which was obtained as an individual, not via company channels.
  • Participation in OpenAI's launch video was motivated by interest and the presence of a known peer, not compensation.
  • The benchmarks and demonstrations were run via the API and Cursor, not through the main ChatGPT website.

Experience Discrepancy Before and After Launch 03:08

  • Admits it was a mistake to publish videos without first gauging community feedback; he was away at DEF CON during the launch.
  • Early personal experiences with GPT-5 (API/benchmarks) were far superior to what the public experienced at launch.
  • Recognizes that the public used different versions or endpoints of the model, leading to disappointment.
  • Acknowledges that negative experiences seen on chatgpt.com were justified considering the model versions people accessed.

Degraded Model Performance Post-Launch 05:25

  • Notes that initial public feedback reported good performance on day one, which degraded in subsequent days.
  • Finds significant drop in output quality both in Cursor and tools like Copilot, with outputs visually and functionally worse than before.
  • Demonstrates through repeated tests (e.g., UI code generation) that outputs became progressively worse, even with identical prompts.
  • Compares GPT-5 negatively to competitors like Opus, noting both perform poorly but GPT-5 was initially superior in some tasks.

Transparency, Timing, and Video Scheduling Issues 09:41

  • Addresses criticism for releasing an anti-Anthropic video too close to the GPT-5 launch, explaining the timing was a result of pre-planned publishing schedules—not intentional coordination with OpenAI.
  • Reinforces that the views on Anthropic were formed months in advance and not influenced by OpenAI.

On Public Backlash and Changing Sentiment 11:16

  • Notes multiple creators and users praised GPT-5 based on their early experiences, now revising their opinions due to deteriorated model outputs.
  • Clarifies that the early GPT-5 was not well-suited for conversational use: it followed instructions effectively but felt robotic.
  • Describes product teams reverting from GPT-5 to GPT-4.1 due to user feedback about slower and less enjoyable outputs.

Misconceptions About Bias and Integrity 13:02

  • Insists that positive commentary was based on genuine early testing, not undisclosed sponsorship or bias.
  • Acknowledges releasing the video before understanding the widespread negative user sentiment, and wishes he had responded differently.
  • Expresses frustration at being accused of shilling despite transparency and financial loss.

OpenAI's Launch and Product Rollout Mistakes 17:19

  • Highlights the "auto router" as the primary culprit: this system automatically selected (often poorer) model variants based on user queries, leading to inconsistent and often degraded user experiences.
  • Notes that most users received the least powerful version, especially on the free tier, resulting in quality discrepancies.
  • Criticizes OpenAI for hiding model choices from the UI, removing access to other models, and not clearly communicating the differences between model tiers (e.g., mini, nano).
  • Points out OpenAI's aim to simplify user experience and reduce costs but states this strategy failed to deliver quality.
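The routing failure described above can be sketched in a few lines. Everything here is hypothetical (the tier names, thresholds, and variant names are invented for illustration and do not reflect OpenAI's actual routing logic); the sketch only shows why a router that silently downgrades requests produces inconsistent quality for identical prompts.

```python
# Hypothetical sketch of an "auto router" that silently picks a model
# variant per request. All names and rules are invented for illustration.

def route(prompt: str, tier: str) -> str:
    """Pick a model variant based on user tier and a crude query heuristic."""
    if tier == "free":
        # Free-tier traffic is pinned to the cheapest variant regardless
        # of query complexity -- the quality gap many users reported.
        return "model-nano"
    # Paid traffic gets the biggest variant only if the query "looks hard".
    looks_hard = len(prompt) > 200 or "code" in prompt.lower()
    return "model-full" if looks_hard else "model-mini"

# The same prompt lands on different variants depending on tier,
# and the user never sees which one answered:
print(route("Write code for a login form", "free"))  # model-nano
print(route("Write code for a login form", "pro"))   # model-full
print(route("Hi!", "pro"))                           # model-mini
```

Because the routing decision is invisible in the UI, two users (or the same user on different days) can send the same prompt and get answers of very different quality, which matches the inconsistency the video describes.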

Issues with Third-Party Tools and Model Integration 22:05

  • Discusses the lack of robust code tooling for GPT-5 compared to Anthropic/Claude, making integration and benchmarking more difficult.
  • States that GPT-5 requires different prompting and system design versus previous models, and that changes were not clearly communicated to developers.
  • Mentions that although GPT-5 is better than competitors for agentic/code tasks, most users and tools are not prepared for its unique handling.

The Creator's Perspective on Honesty and Community Response 25:23

  • Concludes that OpenAI’s launch was mishandled, leading to user confusion and loss of trust.
  • Expresses frustration over being penalized for honest, authentic reporting, noting that sharing personal experience has come at a financial and emotional cost.
  • Stresses refusing sponsorships and maintaining transparency, but worries about the viability of this approach amid ongoing backlash.
  • Asks for understanding, stating honest intent and the hope for future improvements in model quality and community discourse.