Apple released a paper titled "The Illusion of Thinking," arguing that reasoning models, LLMs trained to produce long chains of thought, do not truly think and are not significantly superior to traditional models.
The paper points to issues such as data contamination in training data, suggesting that models may effectively be "cheating" their way to benchmark results.
Apple posits that these models lack generalization ability, which is critical for reasoning tasks.
The findings suggest significant limitations in current reasoning models, yet the models still demonstrate intelligence in other domains like code generation.
The paper does not address the models' ability to use code to solve puzzles, which could represent a different form of reasoning.
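To make that point concrete, here is a minimal sketch of what such a coded solution could look like for the Tower of Hanoi puzzle featured in the paper; the function name and output format are illustrative and come from neither the paper nor the video.

```python
# Illustrative sketch: a few lines of Python solve Tower of Hanoi exactly,
# whereas enumerating all 2**n - 1 moves step by step in natural language
# is where the paper reports reasoning models breaking down at larger n.

def hanoi(n, source="A", target="C", spare="B"):
    """Yield the optimal move sequence for n disks (peg names are illustrative)."""
    if n == 0:
        return
    yield from hanoi(n - 1, source, spare, target)
    yield (source, target)  # move the largest remaining disk
    yield from hanoi(n - 1, spare, target, source)

if __name__ == "__main__":
    moves = list(hanoi(10))
    print(len(moves))  # 1023 moves (2**10 - 1), generated without error
```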
The presenter invites viewers to consider whether coding solutions should be credited as a valid form of intelligence.