I was wrong about GPT-5

Introduction & Setting Context 00:00

  • The creator explains there is significant confusion and frustration about the GPT-5 launch and clarifies that some misunderstandings are partly their fault.
  • Emphasizes that the public's experience with GPT-5 differs from the original testing experience.
  • Outlines intent to cover personal involvement with OpenAI, compensation, shifting experiences with the model, and issues since launch.

Personal Involvement and Compensation 00:54

  • Denies being paid by OpenAI; the only money offered was a $1,000 appearance fee, which he declined.
  • Discloses personal financial losses (~$25,000) from inference costs related to T3 Chat.
  • Details the process for gaining early access to GPT-5, which was obtained as an individual, not via company channels.
  • Participation in OpenAI's launch video was motivated by interest and the presence of a known peer, not compensation.
  • The benchmarks and demonstrations were run via the API and Cursor, not through the main ChatGPT website.

Experience Discrepancy Before and After Launch 03:08

  • Admits it was a mistake to publish videos without first gauging community feedback; he was away at DEF CON during the launch.
  • Early personal experiences with GPT-5 (API/benchmarks) were far superior to what the public experienced at launch.
  • Recognizes that the public used different versions or endpoints of the model, leading to disappointment.
  • Acknowledges that negative experiences seen on chatgpt.com were justified considering the model versions people accessed.

Degraded Model Performance Post-Launch 05:25

  • Notes that initial public feedback reported good performance on day one, which degraded in subsequent days.
  • Finds significant drop in output quality both in Cursor and tools like Copilot, with outputs visually and functionally worse than before.
  • Demonstrates through repeated tests (e.g., UI code generation) that outputs became progressively worse, even with identical prompts.
  • Compares GPT-5 negatively to competitors like Opus, noting both perform poorly but GPT-5 was initially superior in some tasks.

Transparency, Timing, and Video Scheduling Issues 09:41

  • Addresses criticism for releasing an anti-Anthropic video too close to the GPT-5 launch, explaining the timing was a result of pre-planned publishing schedules—not intentional coordination with OpenAI.
  • Reinforces that the views on Anthropic were formed months in advance and not influenced by OpenAI.

On Public Backlash and Changing Sentiment 11:16

  • Notes multiple creators and users praised GPT-5 based on their early experiences, now revising their opinions due to deteriorated model outputs.
  • Clarifies that the early GPT-5 was not well-suited for conversational use: it followed instructions effectively but felt robotic.
  • Describes product teams reverting from GPT-5 to GPT-4.1 due to user feedback about slower and less enjoyable outputs.

Misconceptions About Bias and Integrity 13:02

  • Insists that positive commentary was based on genuine early testing, not undisclosed sponsorship or bias.
  • Acknowledges releasing the video before understanding the widespread negative user sentiment, and wishes he had responded differently.
  • Expresses frustration at being accused of shilling despite transparency and financial loss.

OpenAI's Launch and Product Rollout Mistakes 17:19

  • Highlights the "auto router" as the primary culprit: this system automatically selected (often poorer) model variants based on user queries, leading to inconsistent and often degraded user experiences.
  • Notes that most users received the least powerful version, especially on the free tier, resulting in quality discrepancies.
  • Criticizes OpenAI for hiding model choices from the UI, removing access to other models, and not clearly communicating the differences between model tiers (e.g., mini, nano).
  • Points out OpenAI's aim to simplify user experience and reduce costs but states this strategy failed to deliver quality.
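The routing failure described above can be sketched in a few lines. Everything here is hypothetical (the tier names, thresholds, and variant names are invented for illustration and do not reflect OpenAI's actual routing logic); the sketch only shows why a router that silently downgrades requests produces inconsistent quality for identical prompts.

```python
# Hypothetical sketch of an "auto router" that silently picks a model
# variant per request. All names and rules are invented for illustration.

def route(prompt: str, tier: str) -> str:
    """Pick a model variant based on user tier and a crude query heuristic."""
    if tier == "free":
        # Free-tier traffic is pinned to the cheapest variant regardless
        # of query complexity -- the quality gap many users reported.
        return "model-nano"
    # Paid traffic gets the biggest variant only if the query "looks hard".
    looks_hard = len(prompt) > 200 or "code" in prompt.lower()
    return "model-full" if looks_hard else "model-mini"

# The same prompt lands on different variants depending on tier,
# and the user never sees which one answered:
print(route("Write code for a login form", "free"))  # model-nano
print(route("Write code for a login form", "pro"))   # model-full
print(route("Hi!", "pro"))                           # model-mini
```

Because the routing decision is invisible in the UI, two users (or the same user on different days) can send the same prompt and get answers of very different quality, which matches the inconsistency the video describes.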

Issues with Third-Party Tools and Model Integration 22:05

  • Discusses the lack of robust code tooling for GPT-5 compared to Anthropic/Claude, making integration and benchmarking more difficult.
  • States that GPT-5 requires different prompting and system design versus previous models, and that changes were not clearly communicated to developers.
  • Mentions that although GPT-5 is better than competitors for agentic/code tasks, most users and tools are not prepared for its unique handling.

The Creator's Perspective on Honesty and Community Response 25:23

  • Concludes that OpenAI’s launch was mishandled, leading to user confusion and loss of trust.
  • Expresses frustration over being penalized for honest, authentic reporting, noting that sharing personal experience has come at a financial and emotional cost.
  • Stresses refusing sponsorships and maintaining transparency, but worries about the viability of this approach amid ongoing backlash.
  • Asks for understanding, stating honest intent and the hope for future improvements in model quality and community discourse.