SUMMARY

Alpha Evolve is an AI coding agent developed by Google DeepMind that discovers new algorithms.
It uses Gemini models and evolutionary search to make new scientific discoveries on open problems.
The discovered algorithms are practical enough to be deployed in key parts of Google's infrastructure.
It demonstrates technical creativity, akin to AlphaGo's "move 37," and is considered a step towards self-improving AI.

DeepMind's long-standing mission is to build AI responsibly to benefit humanity; one strand of that work is discovering new algorithms that solve computational problems more efficiently.
The first breakthrough was AlphaTensor in 2022, an AI system that used reinforcement learning to discover matrix multiplication algorithms better than the best human-designed ones.
AlphaGo's ability to efficiently explore large search spaces in Go, leading to creative, non-human-like moves, provided the philosophical basis for searching for algorithms.
Matrix multiplication, a fundamental operation, was long believed to require a cubic number of scalar multiplications (about n^3 for two n-by-n matrices) until Strassen's counterintuitive discovery over 50 years ago; AlphaTensor improved on this further.
The goal then broadened from a single problem such as matrix multiplication (AlphaTensor) to a more general agent that searches directly in the space of programs, leading first to FunSearch (an LLM-based agent for scientific discovery) and then to Alpha Evolve.
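
To make the Strassen point concrete, here is a small Python sketch of his well-known 2x2 scheme, which uses 7 scalar multiplications where the schoolbook method uses 8. The function names are illustrative, and this is the classic textbook construction, not anything produced by AlphaTensor or Alpha Evolve.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using 7 scalar multiplications (Strassen's scheme)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # The seven products; each combines sums/differences of entries.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products using only additions and subtractions.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(a, b):
    """Schoolbook 2x2 product (8 scalar multiplications), used as a reference."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B) == naive_2x2(A, B))
```

Applied recursively to matrix blocks, saving one multiplication at every level is what brings the cost down from about n^3 to about n^2.81 operations.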

Solutions like Strassen's are ingenious and non-intuitive, and for larger problem sizes the search space becomes vast and the constructions intricate.
That these solutions had not been found earlier was not due to complacency among human researchers; the problems Alpha Evolve tackled had been worked on for decades by top experts.
The system's ability to find new, practical solutions in heavily optimized areas shows both how novel the solutions are and how hard the problems are.

Users define the problem by providing an "evaluation function" that objectively measures how good a proposed solution is (e.g., a simulator for data center job scheduling).
Alpha Evolve then "fills in the how," either starting from scratch or from a strong initial solution provided by the user.
It combines the creative power of large language models to propose improvements with the strictness of the evaluation function to filter and validate ideas.
This process is wrapped within an evolutionary algorithm that explores the space of algorithms, maintaining a diverse pool of potential solutions over time.
Each "generation" of the evolutionary process aims to get stronger by combining ideas from existing strong solutions or introducing new ones.
Alpha Evolve can adapt to the difficulty of problems, quickly solving easy ones or sustaining continuous improvement for extended periods on difficult, open scientific challenges without plateauing.
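
The loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not Alpha Evolve's implementation: the "program" is just a list of numbers, `evaluate` is a hypothetical user-supplied scorer, and `propose` is a random blend-and-perturb step standing in for the LLM that would propose code changes in the real system. The pool here simply keeps the top scorers, which is cruder than maintaining genuine diversity.

```python
import random


def evaluate(candidate):
    """Toy user-supplied scorer: higher is better, and 0.0 is a perfect score."""
    target = [3.0, -1.0, 2.0]  # hypothetical optimum, known only to the evaluator
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def propose(parents, rng):
    """Stand-in for the LLM step: mix two parents, then perturb each entry."""
    first, second = parents
    return [rng.choice(pair) + rng.gauss(0.0, 0.3) for pair in zip(first, second)]


def evolve(evaluate, initial, generations=200, pool_size=20, seed=0):
    """Keep a pool of scored candidates and repeatedly propose, score, and prune."""
    rng = random.Random(seed)
    pool = [(evaluate(initial), initial)]
    for _ in range(generations):
        # Pick two parents from the pool (possibly the same one early on).
        parents = [rng.choice(pool)[1] for _ in range(2)]
        child = propose(parents, rng)
        pool.append((evaluate(child), child))
        # Prune: keep only the best-scoring candidates.
        pool.sort(key=lambda item: item[0], reverse=True)
        del pool[pool_size:]
    return pool[0]


start = [0.0, 0.0, 0.0]
best_score, best = evolve(evaluate, start)
print(evaluate(start), best_score)
```

The user only writes `evaluate` and, optionally, the starting point; everything about how to improve the candidate is left to the search.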

General coding agents often struggle with "trivial" problems or plateau because they make mistakes and operate on partial specifications.
Alpha Evolve leverages the "hallucinations" (creative but potentially incorrect answers) of LLMs by using a robust evaluation function to discern working solutions from non-working ones.
The availability of precise evaluation functions is crucial, unlocking the potential for AI agents to discover solutions beyond human capabilities.
The reliance on strict evaluators might be relaxed in the future, with language models themselves potentially evaluating proposed solutions, as demonstrated by the AI Co-scientist project.
LLM-based agents, particularly with population-based evolutionary approaches, are highly effective at searching large spaces and finding counterintuitive solutions to long-standing problems.
Evaluation can also involve proof agents to verify properties or LLM-based evaluators that "guess" the quality of a solution, which has been shown to produce better results than using a base LLM alone.
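
As a minimal illustration of how a strict evaluator separates working proposals from hallucinated ones, the Python sketch below scores candidate programs given as source strings. Everything here is hypothetical: the `solve` entry point, the test cases, and the candidates are invented for the example, and the bare `exec` call has no sandboxing, so it is suitable only for trusted toy inputs.

```python
def score_candidate(source, cases):
    """Return the fraction of cases passed, or 0.0 if the candidate fails to run."""
    namespace = {}
    try:
        exec(source, namespace)  # illustration only: no sandboxing
        solve = namespace["solve"]
        passed = sum(1 for args, expected in cases if solve(*args) == expected)
    except Exception:
        # Syntax errors, missing entry points, and runtime crashes all score zero.
        return 0.0
    return passed / len(cases)


# Hypothetical specification: solve(a, b) should return a + b.
CASES = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]

# Hypothetical proposals, standing in for LLM output of varying quality.
CANDIDATES = {
    "correct": "def solve(a, b):\n    return a + b\n",
    "plausible_but_wrong": "def solve(a, b):\n    return a * b\n",
    "crashes": "def solve(a, b):\n    return a + undefined_name\n",
    "not_even_python": "def solve(a, b) return a + b\n",
}

scores = {name: score_candidate(src, CASES) for name, src in CANDIDATES.items()}
survivors = [name for name, score in scores.items() if score == 1.0]
print(scores)
print(survivors)
```

Because wrong proposals are cheap to reject, the proposer can afford to be adventurous; only candidates that actually pass the evaluator survive.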

Alpha Evolve's ability to optimize the systems used to train itself (e.g., achieving a 23% speedup in parts of the Gemini model's training infrastructure) is an early sign of self-improvement.
Currently, this self-improvement primarily concerns computational efficiency rather than fundamentally enhancing the model's cognitive abilities.
The long-term question is whether this self-improvement will be a one-off benefit, converge to a limit, or lead to continuous, accumulating improvements without bounds.
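
The three scenarios can be told apart with simple compounding arithmetic. In the Python sketch below, the per-round gain schedules are purely hypothetical; only the 23% figure comes from the example above, and nothing here predicts which regime applies in practice.

```python
def cumulative_speedup(gains):
    """Multiply together per-round fractional gains, e.g. 0.23 means 23% faster."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total


ROUNDS = 50  # hypothetical number of self-improvement rounds

# One-off: a single 23% gain, then nothing further.
one_off = cumulative_speedup([0.23] + [0.0] * (ROUNDS - 1))

# Converging: each round yields half the gain of the previous one.
converging = cumulative_speedup([0.23 * 0.5 ** k for k in range(ROUNDS)])

# Accumulating: the same 23% gain is found again every round.
compounding = cumulative_speedup([0.23] * ROUNDS)

print(one_off, converging, compounding)
```

With shrinking gains the total speedup levels off at a finite value, whereas a constant gain per round grows exponentially; which schedule real systems follow is exactly the open question.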

AI's application to accelerating scientific discovery, especially in mathematics and computer science, is highly promising due to the common availability of automated evaluation functions.
In fields like biology or chemistry, predictive models or simulators can serve as evaluators for designing molecules.
Alpha Evolve is a versatile tool applicable across many branches of science, and its capabilities are expected to expand further.
Science often involves searching for the right ideas or solutions, and AI provides a "superpower" for scientists to explore complex and counterintuitive solution spaces.
The human role in this collaboration involves defining the evaluation function, specifying desired properties, and setting constraints for the solutions.
Alpha Evolve is framed as a tool that makes mathematicians and computer scientists more productive and effective.
The system often provides the algorithm for constructing a solution, which is valuable for understanding the underlying ideas and nature of the universe, particularly in mathematics.
The code generated by Alpha Evolve is "humanlike" and interpretable, allowing human experts to inspect, understand, and make final deployment decisions, unlike opaque neural networks.

Google aims to make Alpha Evolve's capabilities accessible to a wider community, currently through a trusted tester program intended to learn how the system is best used.
Broad accessibility faces challenges due to the need for specific evaluation functions and significant computational resources, especially for difficult problems.
Alpha Evolve is already being used internally at Google to improve data center efficiency, hardware design, and software efficiency across its computational infrastructure.
Many computational problems at Google, both within and outside AI, have yet to be explored with Alpha Evolve, and further results are anticipated.

No Priors Ep. 120 | With Google DeepMind’s Pushmeet Kohli and Matej Balog