China is leading the advancement of open-source AI models, with the new GLM 4.5 model from Z.ai matching top closed-source models in reasoning, coding, and agent capabilities.
Demo of the model simulating a Rubik’s cube—successfully scrambles and solves cubes of increasing difficulty (3x3, 5x5, 10x10).
The model outputs move history and allows custom features, such as setting the number of scrambles.
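The scramble-count and move-history features described above can be illustrated with a small sketch. This is not the code the model generated in the demo; the names `scramble` and `invert` are hypothetical, the notation covers only 3x3 face turns, and undoing the recorded history is just one simple way a simulation could return to the solved state.

```python
import random

FACES = ["U", "D", "L", "R", "F", "B"]   # standard face-turn letters
SUFFIXES = ["", "'", "2"]                # clockwise, counter-clockwise, half turn


def scramble(num_moves, seed=None):
    """Return a list of random face turns, never repeating a face twice in a row."""
    rng = random.Random(seed)
    moves = []
    last_face = None
    for _ in range(num_moves):
        face = rng.choice([f for f in FACES if f != last_face])
        moves.append(face + rng.choice(SUFFIXES))
        last_face = face
    return moves


def invert(moves):
    """Return the sequence that undoes `moves`: reversed order, each turn inverted."""
    inverse_suffix = {"": "'", "'": "", "2": "2"}
    return [m[0] + inverse_suffix[m[1:]] for m in reversed(moves)]


history = scramble(20, seed=42)          # the user-chosen number of scrambles
print("scramble:", " ".join(history))
print("undo:    ", " ".join(invert(history)))
```

Keeping the history as a plain list is what makes both features cheap: the scramble length is a single parameter, and the recorded moves can be displayed or replayed in reverse.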
Demonstrates solving the Tower of Hanoi puzzle, performing deep reasoning without relying on pre-written code, and provides visualized solutions.
The model builds interactive Three.js Lego simulations in a single HTML file, and can build upon previous structures with reasonable accuracy.
Creates a 3D solar system visualization with adjustable settings, tooltips for planetary data, and interactive features like scaling, lighting, and orbit visibility.
GLM 4.5 comes in two versions: the standard model (355B total parameters, 32B active) and the "Air" version (106B total parameters, 12B active), both using a mixture-of-experts architecture.
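As a quick check on what those parameter counts imply, the sketch below computes the share of parameters that is active in each version. The figures come from the summary above; the helper name `active_fraction` is made up for illustration.

```python
# (total parameters, active parameters), both in billions, as quoted above
models = {
    "GLM 4.5": (355, 32),
    "GLM 4.5 Air": (106, 12),
}


def active_fraction(total_b, active_b):
    """Fraction of a mixture-of-experts model's parameters that is active."""
    return active_b / total_b


for name, (total, active) in models.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
# GLM 4.5: 9.0% of parameters active
# GLM 4.5 Air: 11.3% of parameters active
```

In both versions roughly a tenth of the weights is active at a time, which is why a mixture-of-experts model can be much cheaper to run than its total size suggests.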
These are hybrid reasoning models with “thinking” and “non-thinking” modes for various tasks.
In practical use, GLM 4.5 tends to engage in its thinking mode even with simple prompts.
Benchmark performance places GLM 4.5 very close to leading closed-source models (e.g., Grok 4), outperforming models like Claude 4 Opus.
The smaller “Air” version also ranks competitively.
On agentic and tool-use benchmarks, GLM 4.5 exceeds Grok 4 and aligns with other frontier models.
On reasoning benchmarks (MMLU, MATH 500), it scores above Claude 4 Opus but still trails models such as DeepSeek R1 and Gemini 2.5 Pro.
On coding benchmarks, GLM 4.5 ranks near the top—just below Claude.
In terms of parameter efficiency on SWE-bench, GLM 4.5 matches Kimi K2 in quality while being significantly smaller.
The model uses reinforcement learning during post-training to build its agentic capabilities, in line with current state-of-the-art methods.
Demonstrates more model capabilities: Flappy Bird game simulation, accurate 3D Maze Explorer, to-do board interface, animated visualizations (SVG), Python-generated visualizations, and a Pokedex with interactive stats and images.
Revisits the Tower of Hanoi demo with 10 disks; the model provides an algorithmic solution and accurately solves the problem in 1,023 moves, displaying each step at 10 moves per second.
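The 1,023-move figure matches the known minimum of 2^n - 1 moves for n disks. A minimal sketch of the classic recursive solution reproduces it; this is the textbook algorithm, not necessarily the code the model wrote in the demo, and the peg labels and function names are arbitrary.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Yield (disk, from_peg, to_peg) moves that transfer n disks from source to target."""
    if n == 0:
        return
    yield from hanoi(n - 1, source, spare, target)   # clear the n-1 smaller disks
    yield (n, source, target)                        # move the largest disk
    yield from hanoi(n - 1, spare, target, source)   # restack the smaller disks


def replay(moves, n, source="A", target="C", spare="B"):
    """Apply the moves to three pegs, rejecting illegal ones; return the final pegs."""
    pegs = {source: list(range(n, 0, -1)), spare: [], target: []}
    for disk, src, dst in moves:
        if not pegs[src] or pegs[src][-1] != disk:
            raise ValueError(f"disk {disk} is not on top of peg {src}")
        if pegs[dst] and pegs[dst][-1] < disk:
            raise ValueError(f"cannot place disk {disk} on a smaller disk")
        pegs[dst].append(pegs[src].pop())
    return pegs


moves = list(hanoi(10))
print(len(moves))        # 1023, i.e. 2**10 - 1
print(len(moves) / 10)   # 102.3 seconds at 10 moves per second
print(replay(moves, 10)["C"] == list(range(10, 0, -1)))  # True
```

At the playback speed shown in the demo, the full 10-disk solution therefore takes a little under two minutes to animate.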
Open-source models like GLM 4.5 have effectively closed the gap with top closed-source AI, at least until the release of GPT-5.
The video ends with a call to support the channel.