Scaling and the Road to Human-Level AI | Anthropic Co-founder Jared Kaplan

Jared Kaplan's Background and Transition to AI 00:00

  • Kaplan started his career as a theoretical physicist, inspired by science fiction and foundational science questions.
  • He moved from physics to AI, driven by frustration with the slow pace of progress in physics and encouraged by peers about AI's potential.
  • His involvement with Anthropic began through connections with other founders from his physics background.

Fundamentals of Contemporary AI Training 02:13

  • AI models like Claude and ChatGPT are trained in two phases: pre-training (predicting next words from human data) and reinforcement learning (using human feedback to optimize desired behaviors).
  • Scaling up data and compute in both the pre-training and RL phases has led to consistent, predictable improvements in AI capability.

Scaling Laws in AI 04:17

  • Scaling laws discovered in both the pre-training and RL phases show a precise, predictable relationship: increasing compute, model size, and data improves performance smoothly over many orders of magnitude (a toy power-law fit follows this list).
  • These scaling laws give confidence that AI can continue to advance predictably as resources increase.
  • The discovery of these laws was inspired by applying broad, simple questions from physics to AI.
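
To make the power-law shape of such scaling laws concrete, here is a minimal sketch of fitting one; the (compute, loss) values and the fitted exponent are made-up illustrations, not real measurements from the talk.

```python
import numpy as np

# Hypothetical (compute, loss) measurements from small training runs.
# Real scaling-law studies fit many runs spanning orders of magnitude.
compute = np.array([1e15, 1e16, 1e17, 1e18, 1e19])   # training FLOPs
loss    = np.array([3.9, 3.3, 2.8, 2.4, 2.05])       # validation loss

# A power law L(C) = a * C**(-b) is a straight line in log-log space,
# so a linear fit on the logs recovers the exponent b and prefactor a.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope

# Extrapolate the fitted trend to a much larger compute budget.
predicted = a * (1e21) ** (-b)
print(f"exponent b ≈ {b:.3f}, predicted loss at 1e21 FLOPs ≈ {predicted:.2f}")
```

The point is not the specific numbers but that a clean functional form, once fit, lets you forecast performance at compute budgets you have not yet run.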

The Growth of AI Capabilities 08:20

  • AI progress can be visualized on two axes: flexibility (how many modalities a model can handle) and task complexity/duration.
  • Models are progressing from narrow, game-specific systems (e.g., AlphaGo) to flexible, multi-modal, and long-horizon tasks.
  • The length of tasks AI can handle is doubling roughly every seven months, with future models possibly handling projects that would take humans weeks, months, or even years.
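
As a rough illustration of what a seven-month doubling time implies, the sketch below compounds that rate from a hypothetical one-hour task today; the starting point and horizons are assumptions, not figures from the talk.

```python
# Back-of-envelope extrapolation of the "task length doubles roughly every
# seven months" trend. The one-hour starting point is a hypothetical anchor.
DOUBLING_MONTHS = 7
start_hours = 1.0

for years in (1, 2, 3, 5):
    doublings = years * 12 / DOUBLING_MONTHS
    hours = start_hours * 2 ** doublings
    print(f"after {years} year(s): ~{hours:,.0f} hours (~{hours / 40:,.1f} work weeks)")
```

Compounded this way, an hour-long task horizon grows to roughly a work week within a few years and to multi-month projects not long after, which is the "weeks, months, or even years" claim in the bullet above.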

Toward Human-Level AI: What’s Left 11:15

  • To reach broadly human-level AI, models need more organizational knowledge (context about how organizations work), long-term memory, and enhanced oversight for nuanced tasks.
  • Improvements in AI memory and contextual understanding are being incorporated into newer models like Claude 4.
  • Progress is required in generating better reward signals for creative and subjective tasks (e.g., humor, poetry).

Preparing for Accelerating AI Progress 13:41

  • Because models are advancing rapidly, building ambitious products is recommended even if current AI cannot yet support them.
  • Integrating AI efficiently, possibly even using AI to accelerate its own adoption, is a key opportunity.
  • Current adoption is highest in software engineering, but identifying other high-growth domains is an open question.

Recent Advancements: Claude 4 and Beyond 16:01

  • Claude 4 has improved its ability to act as an agent, with better coding, search, supervision, and memory functions.
  • The model can store and retrieve memories across context windows, enabling longer, more complex tasks (a toy illustration of the idea follows this list).
  • Progress remains incremental but steady, following the scaling laws.
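
The sketch below illustrates the general idea of memory that persists across context windows; it is not Anthropic's implementation, and the file name and helper functions are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical illustration: the agent persists short notes to a file, then
# reloads them when a fresh context window starts.
MEMORY_FILE = Path("agent_memory.json")

def save_memory(key: str, note: str) -> None:
    """Store or update a note so it survives beyond the current context."""
    memories = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}
    memories[key] = note
    MEMORY_FILE.write_text(json.dumps(memories, indent=2))

def load_memories() -> str:
    """Render stored notes as text to prepend to a new context window."""
    if not MEMORY_FILE.exists():
        return ""
    memories = json.loads(MEMORY_FILE.read_text())
    return "\n".join(f"- {k}: {v}" for k, v in memories.items())

# Before the context fills up, the agent writes down key progress...
save_memory("build_status", "parser refactor half done; tests failing in edge cases")
# ...and a later session starts by reading the notes back in.
print(load_memories())
```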

AI as Collaborator and Human-AI Workflows 18:09

  • Memory improvements are enabling collaboration on longer tasks, with complex work now achievable on the scale of hours.
  • The boundary between work requiring human judgment and work AI can generate is shrinking, suggesting the human role will shift toward managing, supervising, and validating AI outputs.
  • Products are evolving from co-pilots (requiring human approval) to end-to-end full workflow replacements in some domains.

The Future of AI Automation Versus Collaboration 20:17

  • Some tasks can tolerate less-than-perfect AI performance and are being automated more quickly.
  • High-reliability tasks (requiring 99.9%+ correctness) still keep humans in the loop, but full automation in more areas is expected as reliability improves.
  • Human-AI teamwork is likely to dominate high-complexity areas in the near term.

Leveraging AI’s Breadth and Depth 21:21

  • AI's ability to draw on wide-ranging human knowledge can yield unique insights, especially in interdisciplinary scientific research.
  • Breadth (integrating diverse information) may be as important as depth (solving singular hard problems) for future AI applications.
  • Predicting exactly how these capabilities will be implemented is challenging, but general scaling trends are expected to continue.

Greenfield Opportunities for AI Builders 24:17

  • High-skill, computer-based tasks involving large data interactions—such as finance, legal, and business integrations—are promising fields for AI application.
  • Integrating AI into the fabric of existing businesses, analogous to reimagining factories during electrification, offers significant leverage.

Applying Physics Mindset to AI Research 26:22

  • Physics training helped Kaplan focus on identifying broad, precise trends (like scaling laws) in AI.
  • Asking naive but fundamental questions (such as the exact mathematical nature of learning curves) is valuable in AI’s young and rapidly evolving field.
  • Studying very large neural networks lets researchers apply mathematical techniques familiar from physics.

Interpretability and Measurement in AI 29:15

  • Interpretability research is akin to neuroscience or biology, except that every component of an AI model can be measured directly, enabling more thorough analysis.
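
As a minimal sketch of what "measuring every component directly" can look like in practice, the example below records the full activations of one layer with a PyTorch forward hook; the toy model and layer names are made up for illustration.

```python
import torch
import torch.nn as nn

# Toy model standing in for a network under study; names are hypothetical.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Unlike neurons in a brain, every internal activation is directly readable:
# a forward hook records the full output of any layer on any input.
recorded = {}
def record(name):
    def hook(module, inputs, output):
        recorded[name] = output.detach().clone()
    return hook

model[1].register_forward_hook(record("relu_activations"))

x = torch.randn(3, 8)   # a small batch of inputs
model(x)                # run the forward pass
print(recorded["relu_activations"].shape)  # torch.Size([3, 16]): every unit, every example
```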

Robustness and Limits of Scaling Laws 29:50

  • Scaling laws have proven robust, and deviations from expected trends often indicate training or engineering problems, not limits of the paradigm itself.

Efficiency, Compute, and Model Capabilities 31:04

  • AI training and inference are becoming rapidly more efficient, with roughly 3x to 10x annual algorithmic improvements (a rough compounding of this rate follows this list).
  • While costs will decrease, high-capability frontier models will likely continue to be in demand for their ability to handle complex, long-horizon tasks.
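
Compounding the quoted 3x to 10x annual efficiency range gives a sense of how quickly the cost of a fixed capability level could fall; the starting cost below is an arbitrary unit, not a real price.

```python
# Rough compounding of "3x to 10x annual algorithmic improvements":
# the relative cost of a fixed capability level over a few years.
start_cost = 1.00  # arbitrary unit cost today

for years in (1, 2, 3):
    low  = start_cost / (3 ** years)    # conservative end of the range
    high = start_cost / (10 ** years)   # optimistic end of the range
    print(f"after {years} year(s): cost falls to between {high:.4f} and {low:.4f} of today's")
```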

Advice for Early-Career Technologists 34:45

  • Building at the frontier of AI, understanding model mechanics, and efficiently leveraging and integrating AIs are key strategies for staying relevant.

Audience Q&A: Scaling, Self-Correction, and Training 35:24

  • Expansions in task horizon may stem from improved self-correction and planning, where modest increases in underlying ability enable much longer tasks.
  • For complex, long-horizon tasks outside coding, training data and verification become challenging. AI oversight (AI supervising AI) may improve efficiency and scalability.
  • Generating and curating training tasks increasingly combines both AI and human input as complexity grows.