Teaching a human is an iterative, experience-based process; LLMs can only be "taught" through repeated instructions, which is ineffective for complex skills
Current reinforcement learning (RL) techniques are not adaptive and personalized enough to replace on-the-job learning in humans
Human editors improve by noticing nuances and iterating based on audience feedback; LLMs miss out on this tacit knowledge accumulation
The idea of LLMs developing their own RL environments to practice weaknesses is conceivable but far from reality and difficult to generalize
LLMs can sometimes improve within a single session, but this learning is lost once the context window resets
Techniques like rolling context summaries (e.g., in Claude Code) may work for text-heavy domains, but are brittle elsewhere; important learnings are often omitted from the summary
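The failure mode above can be sketched in a few lines. This is a hypothetical toy, not Claude Code's actual mechanism: `summarize` here is a naive stand-in that keeps only the first sentence of each evicted turn, which is exactly the kind of lossy compression that silently drops an important learning.

```python
def summarize(turns):
    """Naive compressor: keep only the first sentence of each turn.
    (A real agent would call an LLM here; either way, detail is lost.)"""
    return " ".join(t.split(".")[0] + "." for t in turns)

def roll_context(turns, budget=200):
    """Return a context string at most roughly `budget` characters long
    by folding the oldest turns into a summary until recent turns fit."""
    recent = list(turns)
    older = []
    while len(" ".join(recent)) > budget and len(recent) > 1:
        older.append(recent.pop(0))  # evict the oldest turn first
    summary = summarize(older) if older else ""
    prefix = "[summary] " + summary + " " if summary else ""
    return prefix + " ".join(recent)

if __name__ == "__main__":
    turns = [
        "Fixed the parser bug. Root cause was an off-by-one.",
        "User prefers British spelling. Remember this.",
        "Deploy went fine.",
    ]
    print(roll_context(turns, budget=50))
```

In this toy run, the instruction "Remember this." is discarded by the compressor: the agent keeps a gist of the session but loses precisely the detail it was told to retain.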
Without the ability to retain and build on long-term experiences, LLMs can't perform as reliably as humans in evolving workflows
Even absent further algorithmic breakthroughs, AIs capable of on-the-job learning and knowledge sharing across instances could rapidly approach superintelligence
The path to continual learning is expected to be incremental, with imperfect versions appearing before humanlike learning is achieved
Some researchers predict highly capable autonomous agents (e.g., doing complex tax prep) by the end of next year, but the speaker is skeptical
The AI agent scenario described involves automating end-to-end, multi-step processes involving diverse data and human communications—a leap from current abilities
Challenges include the need for long sequential rollouts, limited multimodal training data, and the higher compute cost of training on images and video
Most available pre-training data is insufficient for truly reliable agentic behavior; generating realistic practice data for agents remains an open issue
Even seemingly simple innovations (like RL approaches for math and coding) took years to get working; tackling broad computer-use tasks is therefore even harder