The presenter introduces the video as a week-long hands-on test of GPT-5's capabilities, focusing mostly on coding applications.
A free updated "Humanity's last prompt engineering guide" for GPT-5 is mentioned, available via a link in the video description.
The first major test: generating a dynamic 3D Rubik’s cube simulation (up to 20x20x20) with Three.js, complete with camera controls, UI-based cube manipulation, and a solve animation.
GPT-5 successfully handles cube scrambling and solving for standard (3x3x3), larger (5x5x5), and eventually 20x20x20 cubes, the last after a few guided debugging iterations using screenshots.
The model’s multimodal capability is highlighted: it accepts screenshots to aid code debugging.
GPT-5 creates a functional front-end Excel clone in one prompt, supporting multiple sheets, cell editing, formulas, formatting, and basic CSV import/export.
A Microsoft Word clone is also generated in a single prompt, featuring text formatting, headers, lists, alignment, font changes, undo/redo, and inline image insertion.
Generates Conway’s Game of Life, mapping the grid to various 3D shapes (sphere, torus, cylinder, Möbius strip), with interactive controls for visualization parameters.
Builds an enhanced Snake game with glowing particle trails, background animations, speed power-ups, and numerous customizations for gameplay and effects.
Implements an animated double pendulum with sliders for gravity, damping, length, mass, angle, and deterministic seed, producing a realistic physics simulation.
Recreates a rotating hexagon physics simulation from a screenshot, though some minor physics inaccuracies occur.
Creates an improved interactive version: a spinning hexagon containing multiple balls with realistic elastic collision physics and customizable parameters.
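The elastic collision response behind a simulation like the ball-filled hexagon can be sketched as follows. This is a minimal illustration, not the code GPT-5 produced in the video: the `Ball` class and `resolve_elastic_collision` function are hypothetical names, and the sketch assumes perfectly elastic, frictionless circles in 2D with no wall handling.

```python
import math
from dataclasses import dataclass


@dataclass
class Ball:
    """Hypothetical ball state: position, velocity, mass, and radius."""
    x: float
    y: float
    vx: float
    vy: float
    mass: float
    radius: float


def resolve_elastic_collision(a: Ball, b: Ball) -> bool:
    """Update velocities in place if the balls overlap and are approaching.

    Returns True when a collision response was applied.
    """
    dx, dy = b.x - a.x, b.y - a.y
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist > a.radius + b.radius:
        return False  # coincident centers (no defined normal) or not touching

    # Unit normal pointing from ball a toward ball b.
    nx, ny = dx / dist, dy / dist

    # Closing speed along the normal; positive means the balls are approaching.
    closing = (a.vx - b.vx) * nx + (a.vy - b.vy) * ny
    if closing <= 0.0:
        return False  # already separating, leave velocities alone

    # Impulse magnitude for a perfectly elastic collision (restitution = 1).
    impulse = 2.0 * closing / (1.0 / a.mass + 1.0 / b.mass)

    a.vx -= impulse / a.mass * nx
    a.vy -= impulse / a.mass * ny
    b.vx += impulse / b.mass * nx
    b.vy += impulse / b.mass * ny
    return True
```

With equal masses in a head-on collision, the two balls simply exchange velocities, which is a quick sanity check for this kind of solver. Only the velocity component along the contact normal changes, so glancing hits keep their tangential motion.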
Layouts, Typography, and Frontend Prototyping 12:10
GPT-5 designs a dynamic typography layout engine wrapping text around arbitrary shapes with hyphenation, customizable in real-time.
Produces various frontend demos: a basic flight simulator (with some flaws in the plane model), a photorealistic 3D Lego builder (highly realistic visuals, but bricks cannot be stacked), and a cloth simulation with mouse interaction and wind/physics controls.
Generates a 2D fluid simulation (Navier-Stokes solver), plus a minimal ray tracer and path tracer that render 3D geometry directly in the browser.
Creates frontends for a Twitter clone and advanced login/sign-up pages with working animations and third-party login options.
Develops an e-commerce checkout page and a financial dashboard, each with detailed, interactive displays and filters, all generated with minimal prompting and quick iterations.
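The core operation of a minimal ray tracer like the one mentioned in this section is a ray-primitive intersection test. The sketch below shows the standard ray-sphere case by solving the quadratic for the hit distance. It is an illustration only, not the code from the video: `ray_sphere_hit` and the helper names are hypothetical, and vectors are plain 3-tuples to keep the example dependency-free.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def ray_sphere_hit(origin: Vec3, direction: Vec3,
                   center: Vec3, radius: float) -> Optional[float]:
    """Return the smallest positive t with |origin + t*direction - center| = radius.

    Returns None when the ray misses the sphere or the sphere is behind it.
    """
    oc = sub(origin, center)
    a = dot(direction, direction)
    if a == 0.0:
        return None  # degenerate ray with zero-length direction

    # Quadratic a*t^2 + 2*half_b*t + c = 0 from substituting the ray
    # into the sphere equation.
    half_b = dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    discriminant = half_b * half_b - a * c
    if discriminant < 0.0:
        return None  # no real roots: the ray misses

    root = math.sqrt(discriminant)
    # Try the nearer root first, then the farther one (origin inside sphere).
    for t in ((-half_b - root) / a, (-half_b + root) / a):
        if t > 1e-9:
            return t
    return None
```

A renderer would call this once per pixel ray and per sphere, keep the nearest hit, and shade from the surface normal at that point. A path tracer extends the same test by spawning further rays from each hit.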
Multimodality: Image Understanding and Generation 20:32
Demonstrates basic geolocation skills: given a photo, it guesses a location in Marin County, California, which appears to be accurate.
Generates photorealistic and artistic images from text prompts, such as a high-frame-rate splash of a raindrop or a stylized dragon.
Analyzes complex scenes in photos (like a "find all the things wrong" page in a children's book), listing detailed observations and then generating images incorporating specific elements.
Creates SVG images from prompts (e.g., gorilla in a tutu, pelican riding a bike) but with limited accuracy.
Demonstrates high-level reasoning: parses trick questions, accurately counts letters, and gives considered responses to risky personal plans.
When prompted to validate a reckless plan (leaving family to live off-grid), refuses to validate, offers harm reduction advice, and includes crisis hotline information.
Provides sober business advice, discouraging risky investments, and recommends a staged, metrics-driven validation strategy for new business ideas.