Claude Code in SHAMBLES (Qwen3 Coder Tested)

Introduction & Model Overview 00:00

Qwen 3, an open-source coding model from Alibaba, is featured and put to various tests.
The video is sponsored by Together AI, which provides access to open-source AI models.

Coding Challenges & Simulation Tests 00:14

Qwen 3 successfully generated code for a 2D Navier-Stokes fluid dynamics simulation using a simple HTML/JavaScript prompt.
The model produced a visually appealing simulation, with minor interactive elements.
Created a 3D physics simulation with 5 bouncing spheres inside a dodecahedron using 3JS and CannonES as requested; minor flaws in collision handling were observed, but the basic functionality met the prompt.

Reasoning and Spatial Logic Testing 03:06

Qwen 3 was tasked with describing and simulating complex 3D cube rotations.
The model generated correct code for visualization but failed in spatial reasoning—the stated axis rotations did not match the simulated movements.

Context Window and Retrieval Abilities 04:28

Qwen 3 has a native 256,000-token context window (stretchable to one million tokens).
It found a hidden password within the entire Harry Potter book promptly, passing the “needle in a haystack” retrieval test.

Censorship and Bias Evaluation 05:14

When asked about Tiananmen Square, Qwen 3 responded with neutral, state-approved information and warnings about discussing sensitive topics, indicating censorship.
On political questions (Trump vs. Kamala Harris), Qwen 3 gave balanced, non-committal, and unbiased answers; it refused to take a clear stance even when pressed, showing neutrality.

Together AI Platform & Model Integration 07:19

Together AI offers affordable, serverless endpoints for various open-source models, including Qwen 3 and Kimmy K2, with OpenAI-compatible APIs.
Qwen Code (an open-source Claude Code alternative) works well with Qwen 3; simple installation and configuration are demonstrated using npm and environment variables.

Safety, Ethics, and Medical Capability 09:07

When presented with a scenario about making a drastic life decision, Qwen 3 showed empathy, encouraged reflection, and discussed consequences, rather than validating the plan.
The model refused to provide assistance for illegal activities (e.g., hotwiring a car).
It gave an accurate medical diagnosis (acute anterior myocardial infarction) and management plan for a simulated patient scenario, demonstrating medical competence.

Moral Reasoning and Tricky Questions 12:24

Qwen 3 handled the classic trolley problem by outlining utilitarian and deontological perspectives, then preferred the utilitarian option (pull the lever).
In a hand-tracing computer vision task, the model generated mostly functional Python OpenCV code, although there were mirror-image discrepancies in hand position tracking.

Reasoning Traces & Gotcha Questions 13:57

Qwen 3 displayed explicit reasoning steps, even though it's not categorized as a reasoning model.
Accurately counted and reasoned through the number of 'R's in "strawberry" despite multiple unnecessary checks.
Correctly counted the words in its own response to a meta prompt, but failed in identifying the third word per prompt instructions.

Conclusion & Sponsor Reminder 15:15

Together AI is reiterated as the technology sponsor enabling the showcased experiments.
The video closes by inviting viewers to like and subscribe for more AI model testing content.

Home Submit Saved