The modified MIT license requires prominent "Kimi K2" attribution in the UI of commercial products exceeding 100 million monthly active users or US$20 million in monthly revenue.
There are concerns about the legal enforceability and open-source compatibility of the license.
The license ambiguity raises questions about how it applies to derivative works and distillations.
DeepSeek V3 inspired the presenter to build T3 Chat, an interface for better user experience with AI models.
DeepSeek R1 was a watershed open model that introduced reasoning by exposing intermediate “reasoning tokens” and methodology, allowing others to train and distill similar models.
Tool Calling in AI Models – The Market Context 16:08
Before DeepSeek R1, only OpenAI's o1 model offered effective reasoning; similarly, until recently, Anthropic's Claude models set the standard for reliable tool/function calling.
Tool calling allows AI to trigger external functions for more context-rich and interactive responses.
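The loop described here can be sketched minimally: the model emits a structured tool call, the host executes the matching function, and the result is fed back as context for the next turn. All names below (`get_weather`, `handle_tool_call`, the JSON shape) are illustrative assumptions, not any vendor's actual API.

```python
import json

def get_weather(city: str) -> dict:
    # Stand-in for an external function the model can trigger.
    return {"city": city, "temp_c": 21}

# Registry mapping tool names the model may emit to host-side functions.
TOOLS = {"get_weather": get_weather}

def handle_tool_call(raw: str) -> str:
    """Parse a model-emitted tool call and return the tool result as JSON."""
    call = json.loads(raw)            # e.g. {"name": "...", "arguments": {...}}
    fn = TOOLS[call["name"]]          # a wrong name or bad JSON raises here --
    result = fn(**call["arguments"])  # exactly the malformed-call failure mode
    return json.dumps(result)         # result is appended to the model's context

print(handle_tool_call('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
# → {"city": "Oslo", "temp_c": 21}
```

The reliability question in the points below is precisely whether the model emits well-formed JSON with a valid tool name and correct argument types on every turn.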
Anthropic’s Claude 3.5, 3.7, and 4 are the benchmarks for tool calling reliability; their dominance stems from robust accuracy (even 98% per-call accuracy compounds across multi-step tasks, so small reliability gaps have outsized effects).
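The compounding effect is simple arithmetic: an agentic task chaining n tool calls succeeds only if every call succeeds, so end-to-end reliability decays as p^n. A quick sketch:

```python
# End-to-end success of a chain of n tool calls, each succeeding with
# independent probability p: P(task) = p ** n.

def chain_success(p: float, n: int) -> float:
    return p ** n

for p in (0.98, 0.90):
    print(p, [round(chain_success(p, n), 3) for n in (1, 10, 50)])
# 0.98 [0.98, 0.817, 0.364]
# 0.9 [0.9, 0.349, 0.005]
```

At 98% per-call accuracy, a 50-step task still completes only about a third of the time; at 90%, it almost never does, which is why a few points of per-call reliability separate usable agents from unusable ones.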
Despite intelligence, competitors like Gemini and Grok struggle with tool call reliability and adherence to syntax.
Kimi K2 is the first open model to rival Anthropic’s Claude models in tool calling reliability and agentic capabilities.
Demonstrated success on complex benchmarks (e.g., automatically building 3D scenes, running particle simulations, and making effective API calls).
Outperforms peers on the Minecraft building benchmark “MC-Bench” by using tools methodically and without random errors.
Consistently avoids malformed outputs and errors in controlled tests, showing much higher reliability than previous open models.
Practical Drawbacks and Distillation Potential 28:32
Major drawback: Kimi K2 is slow, with generation speeds (tokens per second) significantly below competitors’.
As with DeepSeek R1, the best use of such a large model may be generating synthetic data for training smaller, faster “distilled” models.
K2’s capability to output vast amounts of high-quality tool call data could benefit the entire AI model ecosystem by enabling better agentic models via distillation.
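The distillation pipeline sketched in these points has a simple shape: a slow but reliable teacher labels prompts with tool-call traces, and the traces become fine-tuning data for a smaller, faster student. In the sketch below, `call_teacher` is a placeholder for a real inference API, and the JSONL record format is an assumption, not a standard.

```python
import json

def call_teacher(prompt: str) -> list[dict]:
    # Placeholder for querying the teacher model (e.g. K2) and capturing
    # the tool-call trace it emits for this prompt.
    return [{"name": "search", "arguments": {"query": prompt}}]

def build_dataset(prompts: list[str]) -> list[str]:
    """Produce JSONL records of (prompt, tool-call trace) pairs for student training."""
    records = []
    for prompt in prompts:
        trace = call_teacher(prompt)
        records.append(json.dumps({"prompt": prompt, "tool_calls": trace}))
    return records

dataset = build_dataset(["latest Kimi K2 benchmarks"])
print(dataset[0])
# → {"prompt": "latest Kimi K2 benchmarks", "tool_calls": [{"name": "search", "arguments": {"query": "latest Kimi K2 benchmarks"}}]}
```

Because the teacher only runs offline to produce data, its low tokens-per-second matters far less than it would in an interactive product.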
The ambiguous license complicates using K2-derivative data/models in commercial products above certain thresholds.
Enforcement and interpretation remain uncertain, especially relating to multi-layered usage (e.g., using a third-party API or distilling data into new models).
Even with these caveats, K2 unlocks large-scale generation of tool-call training data, something previously feasible only through closed model APIs (like Anthropic’s), whose terms are restrictive.
While K2’s slow generation speed makes it less suitable for direct daily use, its real value lies in accelerating the wider ecosystem’s progress, especially around tool calling in agentic models.
Its open weights should fuel better models and distillations, with broad downstream benefits.
K2 marks a fundamental advance for open models, matching or surpassing closed models in functionality that was previously exclusive.
The expansion of open-weight models should sharply increase the training data and model capabilities available to the field as a whole.