Why We Don’t Need More Data Centers - Dr. Jasper Zhang, Hyperbolic

Introduction and AI Compute Challenges 00:01

  • Jasper Zhang introduces Hyperbolic, an AI cloud platform for developers, and clarifies that building data centers is important but that building more of them will not, by itself, solve today's compute shortage.
  • AI integration across industries is driving an explosion in demand for GPUs and data center capacity.
  • McKinsey projects that data center capacity will need to roughly quadruple by 2030, built out in about a quarter of the time such capacity has usually taken to construct.
  • Current global data center capacity is 55 GW, with an anticipated demand of 219 GW by 2030, representing a 22% annual growth rate.
  • Building new data centers is slow, expensive (e.g., construction of a single large data center can exceed $1 billion), and faces long electrical grid connection times (up to seven years in places like Northern Virginia).
  • Data centers are significant energy consumers, accounting for 4% of total US electricity use, and are associated with high annual CO2 emissions.
  • Even with timely construction, a supply deficit of over 15 GW in US data centers is expected by 2030.
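
The 22% figure can be checked as a compound annual growth rate. The sketch below assumes the 55 GW baseline refers to 2023, giving seven years of growth to 2030; the summary does not state the baseline year, so that span is an assumption.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1

current_gw = 55      # from the talk
projected_gw = 219   # from the talk
years = 7            # assumption: baseline year is 2023, target year is 2030

rate = cagr(current_gw, projected_gw, years)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption the implied rate comes out near 22%; a shorter span would imply a noticeably higher rate.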

GPU Utilization and Fragmentation 03:59

  • GPU utilization within enterprises is low, with GPUs idle about 80% of the time.
  • The GPU cloud market is highly fragmented, with over 100 different providers, leading to inefficient GPU matching and disparate pricing.
  • Many users either cannot find available GPUs or are forced to pay premium prices, even as many data center GPUs sit underutilized.
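
One way to read the utilization figure: when a GPU is paid for around the clock but busy only part of the time, each hour of actual work effectively costs the hourly rate divided by the busy fraction. A minimal sketch, where the $2.00 rate is a placeholder for illustration and not a figure from the talk:

```python
def effective_cost_per_busy_hour(hourly_rate, utilization):
    """Cost of one hour of actual work when the GPU is paid for continuously."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate / utilization

idle_fraction = 0.80   # from the talk: GPUs idle about 80% of the time
hourly_rate = 2.00     # hypothetical price, for illustration only

cost = effective_cost_per_busy_hour(hourly_rate, 1 - idle_fraction)
print(f"Effective cost per busy GPU-hour: ${cost:.2f}")
```

At 20% utilization, the effective rate is five times the listed price, whatever that price is.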

Solution: GPU Marketplace and Hyperbolic’s Approach 04:35

  • Proposes creating a GPU marketplace or aggregation layer to pool resources from various data centers and GPU providers.
  • Hyperbolic has developed HyperDOS (Hyperbolic Distributed Operating System), which acts as a Kubernetes-like software layer, allowing data centers to join the network and making their GPUs available for rent within five minutes of installation.
  • Users have multiple renting options: spot instances, on-demand, long-term reservations, or hosting models on the aggregated network.
  • This approach provides flexibility and commoditizes GPU access, streamlining the procurement process and eliminating the need for founders or startups to vet multiple suppliers.
  • The marketplace will also include benchmarking information on GPU performance to aid user decisions.

Cost Savings and Productivity Gains 06:09

  • Mathematical modeling shows potential cost reductions of 50–75% for GPU users on the marketplace.
  • Example: Hyperbolic’s beta offers H100 GPUs at $0.99/hour, compared to $11/hour on Google and $2–3/hour on Lambda.
  • Aggregating GPU supply into one pool and distributing it evenly across demand leads to significant price drops, a result the talk supports with queuing theory principles.
  • Users can also save substantial time on the platform, because they no longer need to vet multiple suppliers.
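
The summary does not say which queuing model the talk used, so the following is only a standard illustration of the pooling effect, not the talk's own analysis. It assumes an M/M/c queue (Poisson job arrivals, exponential job durations, one GPU per job) and compares ten separate providers with one pooled provider at the same overall load. All numeric parameters are hypothetical, and the sketch shows waiting time rather than price.

```python
def erlang_c(servers, offered_load):
    """Probability that an arriving job has to wait in an M/M/c queue."""
    if offered_load >= servers:
        raise ValueError("queue is unstable: offered load must be below server count")
    # Erlang B via the standard recursion, then convert to Erlang C.
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1 - rho * (1 - blocking))

def mean_wait(servers, arrival_rate, service_rate):
    """Average time a job spends queued before it gets a GPU."""
    offered_load = arrival_rate / service_rate
    return erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)

providers = 10             # hypothetical number of separate GPU providers
gpus_per_provider = 8      # hypothetical
jobs_per_hour = 6.4        # hypothetical arrivals per provider (80% load)
jobs_done_per_hour = 1.0   # hypothetical: each job holds a GPU for one hour on average

fragmented_wait = mean_wait(gpus_per_provider, jobs_per_hour, jobs_done_per_hour)
pooled_wait = mean_wait(providers * gpus_per_provider,
                        providers * jobs_per_hour,
                        jobs_done_per_hour)

print(f"Mean wait, fragmented: {fragmented_wait:.3f} h")
print(f"Mean wait, pooled:     {pooled_wait:.3f} h")
```

With the same total supply and demand, the pooled queue waits less, which is the usual argument for why aggregation lets the same hardware serve more demand.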

Use Case: Startup Resource Flexibility 08:22

  • Scenario: A startup initially rents 1,000 GPUs for training, later requires 10,000 for a month, and eventually only needs 500 for hosting.
  • On traditional clouds, users might be forced to overcommit and underutilize resources, leading to unnecessary costs.
  • With Hyperbolic, resources can be flexibly rented and released for resale, reducing costs from $43.8 million to $6.9 million, a saving of roughly 6x.
  • Idle GPUs can be made available to others, benefiting the wider AI community and further driving down costs.
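
The reasoning behind the scenario can be sketched as a comparison between reserving peak capacity for the whole period and paying only for what each phase uses. The GPU counts come from the talk; the phase lengths (other than the one-month peak) and the hourly price are placeholders, so the totals below are illustrative and are not expected to match the $43.8 million and $6.9 million figures.

```python
HOURS_PER_MONTH = 730  # approximate average

# (gpus, months): GPU counts are from the talk; the 6 and 5 month
# durations are hypothetical, only the 1-month peak is from the talk.
phases = [(1_000, 6), (10_000, 1), (500, 5)]
price_per_gpu_hour = 1.00  # hypothetical flat rate

def reserved_cost(phases, price):
    """Cost when peak capacity is committed for the entire period."""
    peak_gpus = max(gpus for gpus, _ in phases)
    total_months = sum(months for _, months in phases)
    return peak_gpus * total_months * HOURS_PER_MONTH * price

def flexible_cost(phases, price):
    """Cost when each phase pays only for the GPUs it actually uses."""
    gpu_months = sum(gpus * months for gpus, months in phases)
    return gpu_months * HOURS_PER_MONTH * price

reserved = reserved_cost(phases, price_per_gpu_hour)
flexible = flexible_cost(phases, price_per_gpu_hour)
print(f"Reserved at peak: ${reserved:,.0f}")
print(f"Pay per phase:    ${flexible:,.0f}")
print(f"Ratio:            {reserved / flexible:.1f}x")
```

The gap comes entirely from the months in which far fewer than the peak number of GPUs are needed; a flat price cancels out of the ratio.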

Strategic Impact and Future Outlook 10:31

  • Enhanced productivity: Budget savings translate to greater compute access and, under scaling laws, potentially improved AI model quality.
  • Lower barriers to entry: Startups can access more affordable compute, reducing reliance on closed AI models from major providers.
  • Vision: The GPU marketplace could evolve into an all-in-one platform supporting various AI workloads, including online and offline inference and training.

Environmental and Operational Sustainability 11:50

  • Relying solely on building data centers is unsustainable due to land, energy, and emissions concerns.
  • A marketplace-driven approach promotes recycling and reuse of idle compute, leading to smarter resource allocation and a lower environmental footprint.
  • Hyperbolic is launching enterprise-grade products offering 99.5% GPU reliability alongside its marketplace.

Technical Q&A: HyperDOS Implementation 12:52

  • HyperDOS operates like a Kubernetes agent and can be installed on any Kubernetes-ready cluster, including data centers, laptops, or desktops.
  • Within Hyperbolic, clusters (termed “barons”) are managed by a central “monarch” server, which handles user requests, provisions machines, and sets up remote access for users.
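
The monarch and baron roles described above can be pictured with a toy model. This is not HyperDOS code or its API: every class, method, and field name below is hypothetical, and the first-fit placement rule is an assumption made only to keep the sketch short.

```python
from dataclasses import dataclass, field

@dataclass
class Baron:
    """A cluster that has joined the network (hypothetical model)."""
    name: str
    total_gpus: int
    rented_gpus: int = 0

    @property
    def free_gpus(self):
        return self.total_gpus - self.rented_gpus

@dataclass
class Monarch:
    """Central server that tracks barons and handles rental requests (hypothetical)."""
    barons: dict = field(default_factory=dict)

    def register(self, baron):
        """Called when a cluster joins the network."""
        self.barons[baron.name] = baron

    def provision(self, gpus_needed):
        """Reserve GPUs on the first baron with enough free capacity."""
        for baron in self.barons.values():
            if baron.free_gpus >= gpus_needed:
                baron.rented_gpus += gpus_needed
                # Stand-in for the remote-access setup mentioned in the talk.
                return {"baron": baron.name, "gpus": gpus_needed}
        return None  # no single baron can satisfy the request

monarch = Monarch()
monarch.register(Baron("dc-east", total_gpus=8))
monarch.register(Baron("lab-desktop", total_gpus=1))

lease = monarch.provision(4)
print(lease)
```

A real scheduler would also weigh price, hardware type, and reliability; the point here is only the division of labor between one coordinating server and many contributing clusters.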