AlphaGeometry Conquers Olympiad-Level Geometry

NYU Center for Data Science
Feb 22, 2024

Fermat’s Last Theorem, scribbled in the margin of an ancient Greek math text in 1637 by Pierre de Fermat, went unsolved for centuries. The Guinness Book of World Records called it the “most difficult mathematical problem” due to the number of unsuccessful attempts, down through the centuries, to prove it. It was solved in 1994 by British mathematician Andrew Wiles, an event that landed Wiles half a dozen of the most prestigious prizes available to mathematicians, as well as a building named after him at the University of Oxford.

Is this something that could be automated? This is the question that occurred to Trieu H. Trinh, a recent NYU Computer Science PhD graduate, at the outset of his PhD four years ago. Wiles’ achievement was impressive, but 357 years was a really long time to wait for a solution. “What if we could build an AI that could do the same?” he asked, “so that next time if we want to solve a hard problem, we don’t have to wait hundreds of years?”

Solving problems as difficult as Fermat’s Last Theorem would of course be very hard, but Trinh saw a way to get part of the way there: “the only way to make progress on these kinds of large questions is to consider a more toy example.” Geometry problems from the International Mathematical Olympiad (IMO) seemed like the perfect target — difficult, but designed for talented high school students, and so there always exists a solution well within reach. Designing an AI model to solve these problems became the challenge of Trinh’s PhD, which he undertook under the advisement of CDS Assistant Professor of Computer Science & Data Science He He.

Now, Trinh, He, and their team, including Yuhuai Wu, Quoc V. Le, and Thang Luong, have introduced AlphaGeometry in a landmark paper published in Nature and announced by DeepMind. The paper showcases an AI system capable of solving complex geometry problems at a level very close to that of an average IMO gold medalist. AlphaGeometry not only solved 25 out of 30 recent IMO geometry problems but also presented its solutions as human-readable proofs, marking a significant milestone in the field of artificial intelligence. Reflecting on the project’s reception, Trinh shared, “It’s been fun,” acknowledging the whirlwind of attention and the exhaustive four-year journey that brought his idea to fruition.

He He added, “The obvious question is, why do we want to do that? One motivation is that mathematical reasoning is a hallmark of intelligence.”

Trinh identified two critical factors that made geometry an ideal domain for AI exploration: the potential for exhaustive deduction within geometry problems and the relative ease of generating interesting synthetic data for AI training from random exploration. This strategic approach allowed Trinh to circumvent the challenge of data scarcity that sometimes hampers AI’s application in mathematical domains.

He He elaborated on the project’s approach, saying, “We wanted to solve it without any human demonstrations, which deviates from the most common approach where you first get human data, and then you learn it in a supervised way.” Referencing AlphaGo and AlphaZero, DeepMind’s previous models that developed superhuman capabilities at Go by playing against themselves, He said that the advantage of not using human demonstrations is that “because you’re not constrained by human demonstrations, you’re giving the model a chance to do better than humans.”

At its core, AlphaGeometry is a product of two ingredients: a neural-symbolic system, and a novel way of generating this synthetic data. “Basically,” said Trinh, “I built a neural-symbolic system consisting of a neural language model and a symbolic engine, and put them together. The neural language model handles the part of geometry problems where we have to construct new points and lines in order to solve them. Whereas the symbolic engine handles the remaining heavy, mechanical deductions.”
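The division of labor Trinh describes can be sketched in a few lines of Python. This is a toy illustration, not AlphaGeometry's code: facts are opaque strings rather than geometric relations, the rule format is an assumption, and `toy_proposer` is a hypothetical stand-in for the neural language model that simply suggests one fixed construction. All function and variable names are invented for this sketch.

```python
from typing import Callable, Iterable, Optional

# A rule says: once every premise is known, the conclusion is known too.
# Facts are plain strings here; a real engine reasons over geometric relations.
Rule = tuple[frozenset[str], str]
Proposer = Callable[[set[str], str], Optional[str]]


def deduce_closure(facts: Iterable[str], rules: list[Rule]) -> set[str]:
    """Symbolic side: apply rules until no new fact appears (forward chaining)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def solve(premises: set[str], goal: str, rules: list[Rule],
          propose: Proposer, max_constructions: int = 3) -> tuple[bool, list[str]]:
    """Alternate mechanical deduction with proposed auxiliary constructions."""
    known = deduce_closure(premises, rules)
    added: list[str] = []
    for _ in range(max_constructions):
        if goal in known:
            break
        construction = propose(known, goal)  # the "neural" step in this sketch
        if construction is None:
            break
        added.append(construction)
        known = deduce_closure(known | {construction}, rules)
    return goal in known, added


# Toy problem: the goal is only reachable after the extra fact "X" is introduced.
rules = [
    (frozenset({"A", "B"}), "C"),
    (frozenset({"C", "X"}), "GOAL"),
]


def toy_proposer(known: set[str], goal: str) -> Optional[str]:
    """Hypothetical stand-in for the language model: suggests "X" once."""
    return "X" if "X" not in known else None


solved, constructions = solve({"A", "B"}, "GOAL", rules, toy_proposer)
print(solved, constructions)  # True ['X']
```

In the toy problem, deduction alone stalls after deriving "C"; the goal becomes reachable only once the proposer introduces "X", which mirrors the role of auxiliary points and lines in a geometry proof.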

“The tricky part,” according to Trinh, was that, to make it work, the team needed training data to train the language model. “For geometry, we don’t have much training data. And so we figured out a way to generate synthetic data.” By generating a vast pool of synthetic training data, Trinh’s system sidestepped the significant challenge of data deficiency, enabling the AI to navigate through the complexities of Euclidean plane geometry without human demonstrations.

He, reflecting on the team’s success and praising Trinh’s doggedness and dedication to the mission, said, “I’m very happy with the result. It showed that we could achieve something very close to the performance of human gold medalists in solving geometry problems.”

Beyond its academic significance, Trinh envisions AlphaGeometry serving as a novel tool for educational support, offering students possible suggestions when they encounter obstacles in solving geometry problems. However, he emphasizes the importance of interaction in the educational process, acknowledging the current system’s limitations in facilitating a dialogue between students and the AI.

The development of AlphaGeometry benefited from collaboration with the group formerly known as Google Brain, now part of Google DeepMind. This industry connection provided Trinh with the computational resources and intellectual community necessary to refine his prototype into a robust AI system. This journey, marked by interdisciplinary collaboration and a relentless pursuit of knowledge, exemplifies the innovative spirit driving advancements in artificial intelligence.

Reflecting on the broader implications of his work, Trinh articulates an optimistic outlook. “What we’ve built with AlphaGeometry is essentially a proof of concept,” he said, addressing the ambitious goal of creating AI capable of solving problems that have historically taken humanity centuries to unravel. This ‘toy example’ paves the way for future research aimed at bridging the gap between current capabilities and the dream of an AI that can independently tackle humanity’s most enduring mathematical puzzles. “The impact of AlphaGeometry on AI and math research is yet to be seen, but this work certainly offers some new ideas and inspiration for the fields and definitely moved the needle a bit.”

He He expressed similar sentiments about the potential of the project’s methodology, saying, “There is reason to think this approach is generalizable to other domains. First, generating synthetic data using symbolic engines, and then filtering to select the high-quality data, and then training a model on this data — this recipe is general.”
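The generate-then-filter part of this recipe can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the team's pipeline: facts are opaque strings, the rules and the `min_premises` quality filter are invented for illustration, and the final training step is left out entirely. Every name below is hypothetical.

```python
import random

# A rule says: once every fact in `needs` is known, `conclusion` is known too.
Rule = tuple[frozenset[str], str]


def deduce_with_parents(premises: set[str], rules: list[Rule]) -> dict[str, frozenset[str]]:
    """Forward-chain from the premises, recording what each new fact was derived from."""
    parents: dict[str, frozenset[str]] = {p: frozenset() for p in premises}
    changed = True
    while changed:
        changed = False
        for needs, conclusion in rules:
            if conclusion not in parents and all(n in parents for n in needs):
                parents[conclusion] = needs
                changed = True
    return parents


def used_premises(fact: str, parents: dict[str, frozenset[str]]) -> set[str]:
    """Trace a fact back to the premises its derivation actually used."""
    if not parents[fact]:  # premises have no parents (assumes no rule has empty `needs`)
        return {fact}
    used: set[str] = set()
    for parent in parents[fact]:
        used |= used_premises(parent, parents)
    return used


def generate_examples(atoms: list[str], rules: list[Rule],
                      n_samples: int, seed: int = 0) -> list[tuple[list[str], str]]:
    """Step 1: sample random premise sets and let the symbolic engine derive conclusions."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n_samples):
        premises = set(rng.sample(atoms, k=rng.randint(1, len(atoms))))
        parents = deduce_with_parents(premises, rules)
        for fact, needs in parents.items():
            if needs:  # keep derived facts only, not the premises themselves
                examples.append((sorted(used_premises(fact, parents)), fact))
    return examples


def filter_examples(examples: list[tuple[list[str], str]],
                    min_premises: int = 2) -> list[tuple[tuple[str, ...], str]]:
    """Step 2: drop duplicates and examples that use too few premises."""
    unique = {(tuple(p), goal) for p, goal in examples}
    return sorted(e for e in unique if len(e[0]) >= min_premises)


atoms = ["A", "B", "C"]
rules = [
    (frozenset({"A", "B"}), "D"),
    (frozenset({"D", "C"}), "E"),
]
dataset = filter_examples(generate_examples(atoms, rules, n_samples=50))
# Step 3 (not shown): train a model on the (premises, conclusion) pairs in `dataset`.
```

Each surviving pair lists only the premises a conclusion actually depends on, so a model trained on such pairs would see problems stated without irrelevant facts. How the real system scores data quality is not described in this article, so the filter here is only a placeholder.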

Looking ahead, Trinh, He, and their colleagues are committed to studying the remaining challenges and identifying opportunities for progress, with the ultimate aim of enhancing AI’s problem-solving prowess. Their work stands as a testament to the ongoing journey of discovery and innovation at the intersection of artificial intelligence and mathematics.

By Stephen Thomas
