homehome Home chatchat Notifications


DeepMind AI Matches Top Students in Solving Math Olympiad Problems

DeepMind's AI achieves medal-level performance in the International Mathematical Olympiad

Tibi Puiu
July 29, 2024 @ 9:08 pm

share Share

Credit: DeepMind.

Google’s DeepMind announced that its AI system has matched the performance of some of the world’s top students in solving math problems at the prestigious International Mathematical Olympiad (IMO).

After mastering highly complex games like Go and various strategy board games, beating all human champions at each game, DeepMind’s AI has now tackled complex mathematical challenges. The London-based machine-learning company revealed that its AI had solved four out of six problems presented at this year’s IMO in Bath, UK.

The AI’s solutions were thoroughly evaluated by top mathematicians, earning a score of 28 out of 42 — just one point shy of a gold medal but still good enough to win a silver medal. Last week, 58 competitors took home gold at the competition, while 123 others took home silver.

A Milestone in AI

Joseph Myers, a mathematician from Cambridge, UK, told Nature that this achievement is “a very substantial advance.” Myers, alongside Fields Medal-winner Tim Gowers, vetted the AI’s solutions and helped select the original problems for the IMO.

The ultimate goal of math-solving AI is to have these machines solve some of the most challenging open questions in the field. Among these is the Riemann hypothesis, a problem relating to the distribution of prime numbers that has stood unsolved for nearly 160 years. A proof would not only win the $1 million reward that comes for solving one of the seven Millennium Prize Problems established by the Clay Mathematics Institute in 2000, but it could also have applications in predicting prime numbers, important in cryptography.

Illustration of Problem 4 from the Math Olympiad.
Illustration of Problem 4, which asks to prove the sum of ∠KIL and ∠XPY equals 180°. AlphaGeometry 2 proposed to construct E, a point on the line BI so that ∠AEB = 90°. Point E helps give purpose to the midpoint L of AB, creating many pairs of similar triangles such as ABE ~ YBI and ALE ~ IPC needed to prove the conclusion. Credit: DeepMind.

However, current AI systems are a far way from approaching these long-standing math research problems. DeepMind’s latest accomplishment shows there’s good progress toward attending the required capabilities for cracking tough math problems because the IMO has always been a benchmark for such AI advancements.

This year, DeepMind’s AlphaGeometry2 solved the geometry problem in under 20 seconds. For algebra and number theory problems, the team developed a new system called AlphaProof, which successfully solved three problems over three days. Meanwhile, human participants are offered only two sessions of 4.5 hours each. Despite this progress, AlphaProof could not solve the two combinatorics problems.

Combining AI Techniques for Better Results

Popular AI technologies people are familiar with, such as ChatGPT, are large language models (LLMs). While LLMs are somewhat capable of solving math, they’re very much hit or miss. If you ask an LLM to explain a mathematical concept to you, like how to multiply two matrices together, chances are it will tell you how to do it correctly. However, if you ask it to multiply two matrices together, chances are the answer will be wrong.

AlphaProof leverages a combination of language models and reinforcement learning, employing the successful AlphaZero engine previously used in games like Go, which won 499 out of 500 games against top competitors like Crazy Stone and Zen in single-machine matches.

This approach enables the AI to learn through trial-and-error. However, AlphaProof needs a framework that tells it when it makes an error or is on the wrong track. For this purpose, the researchers added another subsystem that can read and write proofs in a formal mathematical language known as Lean.

AI Problem-solving

Since not enough math problems were available, the team trained an additional network to translate a million problems written in natural language into Lean. Although many translations were imperfect, they provided a foundation for AlphaProof’s reinforcement-learning cycles.

IMO problems are highly challenging because they often require creative leaps to solve. You can’t just brute force these problems. According to Gowers, many IMO problems have a “magic-key” property, and once you figure out the right angle to tackle the problem it can become trivial to solve.

AlphaProof’s ability to find these “magic keys” suggests it can make creative leaps in problem-solving, reminiscent of the famous “move 37” made by DeepMind’s AlphaGo in 2016. AlphaGo landed a surprise on the right-hand side of the 19-by-19 board that flummoxed even the world’s best Go players, including Lee Sedol. “That’s a very strange move,” said one commentator during a live stream, himself a nine dan Go player, the highest rank there is. “I thought it was a mistake,” said the other. 

While DeepMind’s AI systems are proving to be highly competent in solving challenging math problems, the leap to solving research-level mathematical questions remains uncertain. Nevertheless, DeepMind’s AI has reached a point where it can solve problems that challenge the world’s best young mathematicians.

“We’re at the point where they can prove not open research problems, but at least problems that are very challenging to the very best young mathematicians in the world,” said DeepMind computer scientist David Silver, one of the architects of AlphaGo.

share Share

The Soviets Built a Jet Powered Train and It Was as Wild as It Sounds

This thing was away of its time and is now building rust in a scrapyard.

This Tiny Nuclear Battery Could Last for Thousands of Years Without Charging

The radiocarbon battery is supposed to be safe for everyday operations.

Mathematician Who Bridged Algebra and the Quantum World Wins 2025 Abel Prize

This year, the Abel Prize — the field’s highest honor — has been awarded to Masaki Kashiwara, prolific Japanese mathematician whose work has quietly reshaped how we understand some of the most complex equations in existence. The Norwegian Academy of Science and Letters announced the award “for his fundamental contributions to algebraic analysis and representation […]

Meta's new AI can read your mind and type your thoughts with startling accuracy

Look like Mr. Zuckerberg is secretly working on something that could work as an alternative to invasive brain chips.

China just unveiled a deep-sea tool that can secretly cut the world’s internet cables

The concern is that China's new deep-sea cable cutter could spark global chaos.

Quadruped Robot Learns How to Ride Skateboard on Its Own

Researchers taught a legged robot to master skateboarding using a novel AI.

This AI-Powered Robot Just Made Breakfast and It Could Cook in Your Future Home

This $27,500 robot is the latest in a series of humanoid robots that have hit the market.

The Roundest (and Most Rectangular) Countries, According to Math

Apparently, Sierra Leone is both very round and quite rectangular.

A Cartoonish Crash Test Raises Real Questions About Tesla’s Autopilot

Mark Rober pits Tesla against lidar in a real-life Wile E. Coyote experiment.

Cambridge Scientists Develop Urine Test for Early Lung Cancer Detection

Lung cancer often goes undetected until it’s too late. But a new urine test developed by Cambridge scientists could change that.