NVIDIA’s New AI Cheated At Parkour…And Got Fixed!

AI Parkour Training Process 00:00

  • An AI-controlled character is taught to traverse a challenging, parkour-like environment using motion capture data recorded from real humans.
  • The initial motion capture dataset contains only about 14 minutes of material, too little for meaningful training on its own.
  • The training method involves three steps: using the available human motion data, generating randomized new levels, and using a physics-based engine to create new, plausible motions on those levels (a minimal sketch of this pipeline follows).
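
A rough sketch of how the three steps could fit together in a single augmentation pass. The helper names (load_mocap_clips, generate_random_terrain, track_with_physics) are hypothetical placeholders for illustration, not the paper's actual API.

```python
def augment_once(dataset):
    """One enrichment pass: pair each human clip with a fresh random level and
    let a physics-based controller re-create the motion on that level."""
    new_clips = []
    for clip in dataset:                                     # step 1: real human motion data
        terrain = generate_random_terrain()                  # step 2: randomized new level
        new_clips.append(track_with_physics(clip, terrain))  # step 3: plausible motion on it
    return new_clips


# Hypothetical usage: start from the ~14 minutes of original motion capture.
mocap_clips = load_mocap_clips()
extra_clips = augment_once(mocap_clips)
```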

Physics Correction and Data Enrichment 01:12

  • The kinematic motions dreamed up by the AI often contain physically implausible behaviors, such as floating or foot sliding, which count as "cheating."
  • A physics engine is used to correct these motions, making them physically believable.
  • The newly corrected movements are added back to the original small dataset, progressively enlarging it across multiple cycles (see the loop sketch after this list).
  • Characters are tasked with following generated paths that involve various parkour actions like climbing and jumping.
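
A minimal sketch of the enrichment loop described above, assuming a kinematic motion generator and a physics-based correction step; train_motion_generator, sample_parkour_paths, and correct_with_physics are invented names used only to illustrate the cycle.

```python
def enrichment_loop(dataset, num_cycles=3):
    """Grow the motion dataset over several cycles of generate -> correct -> retrain."""
    generator = train_motion_generator(dataset)           # learns from the current data
    for _ in range(num_cycles):
        corrected = []
        for path in sample_parkour_paths():               # climbing, jumping, vaulting, ...
            kinematic = generator.propose(path)           # may float or foot-slide ("cheat")
            physical = correct_with_physics(kinematic)    # physics engine makes it believable
            corrected.append(physical)
        dataset = dataset + corrected                     # dataset grows each cycle
        generator = train_motion_generator(dataset)       # retrain on the enlarged dataset
    return generator, dataset
```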

Cheating Detection and Iterative Improvement 02:06

  • Early attempts at motion enrichment result in the AI taking non-physical shortcuts (cheating), which are subsequently corrected through physics-based filtering (a toy plausibility check is sketched below).
  • After three cycles of data enrichment and correction, the AI demonstrates significantly more realistic and complex movement abilities.
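
To make "physics-based filtering" concrete, here is a toy plausibility check that flags the two cheats mentioned earlier, floating and foot sliding. The thresholds, frame format, and function name are assumptions for illustration, not the paper's method.

```python
import numpy as np

def is_plausible(foot_positions, contact_flags, slide_tol=0.02, float_tol=0.05):
    """foot_positions: (T, 3) array in meters; contact_flags: (T,) boolean array.

    Returns False if the clip shows obvious cheating (floating or foot sliding)."""
    heights = foot_positions[:, 2]

    # Floating: the foot is flagged as in contact but hovers above the ground plane.
    if np.any(contact_flags & (heights > float_tol)):
        return False

    # Foot sliding: the foot moves horizontally between frames while staying in contact.
    horizontal_step = np.linalg.norm(np.diff(foot_positions[:, :2], axis=0), axis=1)
    in_contact_both = contact_flags[:-1] & contact_flags[1:]
    if np.any(in_contact_both & (horizontal_step > slide_tol)):
        return False

    return True
```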

AI Skill Development and Generalization 02:42

  • The AI learns to combine multiple movements (e.g., jumping, grabbing, and climbing up edges) in novel ways not present in the original dataset.
  • Its abilities are tested on environments it has never seen before, demonstrating strong generalization.
  • The green character represents uncorrected AI motions, while the blue character has undergone physics-based correction and is more physically plausible.

Natural Movements and Surprising Results 03:50

  • The AI performs advanced maneuvers, such as sequential jumps, and even hops naturally on one leg without hesitation, producing realistic and fluid behavior.
  • The AI effectively handles challenging new levels, like climbing complex monuments and spirals.

Research Insights and Technical Details 04:37

  • Each original motion capture clip is transformed into 50 different terrain variations, greatly increasing training data diversity from a small base.
  • The training can be accomplished using a single high-end graphics card (NVIDIA A6000), taking up to a month to complete.
  • The motion generation process is slow, requiring about 25 seconds of computation for each second of character movement (see the back-of-the-envelope numbers below).
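
Putting the quoted figures together gives a rough sense of scale. The numbers (14 minutes of data, 50 terrain variations, 25 s of compute per second of motion) come from the video; the assumption that each variation keeps its source clip's duration is mine.

```python
original_minutes = 14            # original motion capture data
terrain_variants = 50            # terrain variations per clip
compute_per_motion_second = 25   # seconds of compute per second of generated motion

augmented_hours = original_minutes * terrain_variants / 60
print(f"~{augmented_hours:.1f} hours of terrain-paired clips after one pass")  # ~11.7 hours
print(f"Generation runs roughly {compute_per_motion_second}x slower than real time")
```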

Limitations and Future Possibilities 05:19

  • Despite these successes, the method remains slow and computationally demanding, which makes real-time use impractical for now.
  • There is optimism about further improvements and the potential adoption of similar technology in video games and virtual worlds.

Outro & Sponsor Mention 06:01

  • The video briefly promotes Deep Infra, an AI inference cloud platform offering affordable access to advanced AI models and tools.