Can computers do mathematical research?


AI comes of age

The modern field of artificial intelligence (AI) began in 1950 with Alan Turing’s landmark paper “Computing Machinery and Intelligence,” which outlined the principles of AI and proposed the Turing test. Although early researchers were confident that AI systems would soon be a reality, inflated promises and expectations led to disappointment in the 1970s and again in the 1980s.

A breakthrough of sorts came in the late 1990s and early 2000s with the emergence of Bayes-theorem-based statistical methods, which quickly displaced the older methods based mostly on formal reasoning. Combined with steadily advancing computer technology, a gift of Moore’s Law, these methods finally yielded practical and effective AI systems.

One notable milestone in modern AI technology came in March 2016, when a computer program named “AlphaGo,” developed by researchers at DeepMind, a subsidiary of Alphabet (Google’s parent company), defeated Lee Se-dol, a South Korean Go master, 4-1 in a five-game tournament, an achievement that many observers had not expected to occur for decades, if ever. Then in October 2017, DeepMind researchers developed a new program from scratch, AlphaGo Zero, which was given only the rules of Go and a simple reward function and then instructed to play games against itself. After just three days of training (4.9 million training games), AlphaGo Zero had advanced to the point that it defeated the earlier AlphaGo program 100 games to zero, and after 40 days of training, AlphaGo Zero’s performance was as far ahead of champion human players as champion human players are ahead of amateurs. See this Economist article, this Scientific American article and this Nature article.

AI and machine-learning methods are being used for more than playing Go. They are used in Apple’s Siri and Amazon’s Alexa voice recognition systems, in Facebook’s facial recognition API, in Apple’s 3-D facial recognition hardware and software, and in Tesla’s “autopilot” feature. See this earlier Math Scholar blog for additional discussion.

Computer discovery of mathematical theorems

The present author recalls discussing the future of mathematics with Paul Cohen, who in 1963 proved that the continuum hypothesis is independent of the axioms of Zermelo-Fraenkel set theory. Cohen was convinced that the future of mathematics, and much more, lay in artificial intelligence. Reuben Hersh recalls Cohen saying specifically that at some point in the future mathematicians would be replaced by computers. So how close are we to Cohen’s vision?

In fact, computer programs that discover new mathematical identities and theorems are already a staple of the field known as experimental mathematics. Here is just a handful of the many computer-based discoveries that could be mentioned:

  1. A new formula for pi with the property that it permits one to rapidly calculate binary or hexadecimal digits of pi at an arbitrary starting position, without needing to calculate any of the digits that came before (see the sketch after this list).
  2. The surprising fact that if the Gregory series for pi is truncated to 10^n/2 terms for some n, the resulting decimal value is remarkably close to the true value of pi, except for periodic errors that are related to the “tangent numbers.”
  3. Evaluations of Euler sums (compound infinite series) in terms of simple mathematical expressions.
  4. Evaluations of lattice sums from the Poisson equation of mathematical physics in terms of roots of high-degree integer polynomials.
  5. A new result for Mordell’s cube sum problem.
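
The formula mentioned in item 1 is the BBP (Bailey-Borwein-Plouffe) formula, pi = sum_{k>=0} 16^(-k) [4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)]. As a rough illustration of how the digit-extraction trick works, here is a minimal Python sketch (an illustration only, not production code; it works in double precision, so only the first several hexadecimal digits returned at each position are reliable):

```python
def pi_hex_digits(d, n=8):
    """Return n hexadecimal digits of pi starting at hex position d+1,
    using the BBP digit-extraction algorithm (double precision only)."""

    def series(j):
        # Fractional part of sum_{k>=0} 16^(d-k) / (8k + j).
        s = 0.0
        for k in range(d + 1):
            # Modular exponentiation keeps the intermediate numbers small.
            s = (s + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
        for k in range(d + 1, d + 8):
            # A few tail terms, where 16^(d-k) is already tiny.
            s += 16.0 ** (d - k) / (8 * k + j)
        return s % 1.0

    # Fractional part of 16^d * pi, from the BBP formula.
    frac = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0

    digits = []
    for _ in range(n):
        frac *= 16
        digit = int(frac)
        digits.append("0123456789ABCDEF"[digit])
        frac -= digit
    return "".join(digits)

# The hexadecimal expansion of pi begins 3.243F6A8885A308D3...
print(pi_hex_digits(0))        # expected to start with 243F6A88
print(pi_hex_digits(1000000))  # hex digits starting at position 1,000,001
```

Note that the modular exponentiation step is what makes it unnecessary to compute any earlier digits: the work grows only linearly with the target position.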

Computer proofs of mathematical theorems

In most of the above examples, the new mathematical facts in question were found by numerical exploration on a computer, and then later proven rigorously, the old-fashioned way, by mostly human efforts. So what about computers actually proving theorems?

Actually, this too is already old hat. Perhaps the best example is Thomas Hales’ 2003 proof of the Kepler conjecture, namely the assertion that the simple scheme of stacking oranges typically seen in a supermarket has the highest possible average density of any possible arrangement, regular or irregular. Hales’ original proof met with some controversy, since it relied on a computation documented by 250 pages of notes and three gigabytes of computer code and data. So Hales and his colleagues began entering the entire proof into a computer proof-checking program. In 2014 this process was completed and the proof was certified.

In November 2019, researchers at Google’s research center in Mountain View, California, published results for a new AI theorem-proving program. This program works with the HOL Light theorem prover, which was used in Hales’ proof of the Kepler conjecture, and can prove, essentially unaided by humans, many basic theorems of mathematics. They have provided their tool in an open-source release, so that other mathematicians and computer scientists can experiment with it.

The Google AI system was trained on a set of 10,200 theorems that the researchers had gleaned from several sources, including many sub-theorems of Hales’ proof of the Kepler conjecture. Most of these theorems were in the areas of linear algebra, real analysis and complex analysis, but the Google researchers emphasize that their approach is very broadly applicable. In the initial release, their software was able to prove 5919 of these theorems, or 58% of the training set. When they applied their software to a set of 3217 new theorems that it had not yet seen, it succeeded in proving 1251, or 38.9%.

Mathematicians are already envisioning how this software can be used in day-to-day research. Jeremy Avigad of Carnegie Mellon University sees it this way:

You get the maximum of precision and correctness all really spelled out, but you don’t have to do the work of filling in the details. … Maybe offloading some things that we used to do by hand frees us up for looking for new concepts and asking new questions.

For additional details see the Google authors’ technical paper, and a New Scientist article by Leah Crane.

How close are we to fully automatic mathematical reasoning?

Present-day computerized theorem provers are typically categorized as automated theorem provers (ATPs), which use computationally intensive search methods to find a proof on their own, and interactive theorem provers (also known as proof assistants), in which a human supplies the steps of an argument and the software checks each step for logical errors. Researchers in the field, however, acknowledge that both types of software are still a far cry from a completely independent computer-based mathematical reasoning system.
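
To make the distinction concrete, here is a minimal sketch in Lean 4 (chosen purely for illustration; the Google work described above uses HOL Light, and this sketch assumes a recent Lean toolchain in which the omega tactic is built in). The first proof is discharged by an automated decision procedure; the second is spelled out step by step by the human, with the system checking each step:

```lean
-- Automated style: a single tactic searches for, or decides, the proof.
theorem automated_example (m n : Nat) : m + n + 0 = n + m := by
  omega                     -- decision procedure for linear arithmetic

-- Interactive style: the human supplies each step; the system checks it.
theorem interactive_example (m n : Nat) : m + n + 0 = n + m := by
  rw [Nat.add_zero]         -- the goal becomes m + n = n + m
  exact Nat.add_comm m n    -- close the goal with the commutativity lemma
```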

For one thing, these tools have received a cool reception from many present-day mathematicians, in part because they require considerable study and practice to use proficiently, and in part because of a general distaste for the notion of automating mathematical thought.

But some mathematicians are embracing these tools. Kevin Buzzard of Imperial College London, for example, has begun to focus his research on computerized theorem provers. But he acknowledges, “Computers have done amazing calculations for us, but they have never solved a hard problem on their own. … Until they do, mathematicians aren’t going to be buying into this stuff.”

Emily Riehl, of Johns Hopkins University, teaches students to write mathematical proofs using a computerized tool. She reports that these tools help students to rigorously formulate their ideas. Even for her own research, she says that “using a proof assistant has changed the way I think about writing proofs.”

Vladimir Voevodsky of the Institute for Advanced Study in Princeton, after finding an error in one of his own published results, was an ardent advocate of using computers to check proofs until his death in 2017. Timothy Gowers of Cambridge, who won the Fields Medal in 1998, goes even further, saying that major mathematical journals should prepare for the day when the authors of all submitted papers must first certify that their results have been verified by a computerized proof checker.

Josef Urban of the Czech Institute of Informatics, Robotics and Cybernetics believes that a combination of computerized theorem provers and machine-learning tools is required to produce human-like mathematical research capabilities. In July 2020 his group reported on new mathematical conjectures generated by a neural network that was trained on millions of theorems and proofs. The network proposed more than 50,000 new formulas, but many of these were duplicates, and, as Urban acknowledged, “It seems that we are not yet capable of proving the more interesting conjectures.”

A research team led by Christian Szegedy of Google Research sees automated theorem proving as a subset of the field of natural language processing, and plans to capitalize on recent advances in that field to demonstrate solid mathematical reasoning. He and other researchers have proposed a “skip-tree” training task that exhibits surprisingly strong mathematical reasoning capabilities. Out of thousands of generated conjectures, about 13% were both provable and new, in the sense of not merely being duplicates of other theorems in the database.
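
The published skip-tree work trains a language model on formal expressions taken from a theorem-prover corpus; the minimal Python sketch below (with a made-up formula encoding and no model at all) only illustrates the underlying self-supervised idea: hide one subtree of a formula and ask the model to predict the missing piece.

```python
import random

# Illustrative sketch only (not the authors' code): a formula is represented
# as a nested tuple, e.g. ("*", ("+", "a", "b"), "c") for (a + b) * c.

def subtrees(expr, path=()):
    """Yield (path, subtree) pairs for every subtree of the expression."""
    yield path, expr
    if isinstance(expr, tuple):
        for i, child in enumerate(expr[1:], start=1):
            yield from subtrees(child, path + (i,))

def mask_subtree(expr, path):
    """Return a copy of expr with the subtree at `path` replaced by a placeholder."""
    if not path:
        return "<PREDICT>"
    node = list(expr)
    node[path[0]] = mask_subtree(expr[path[0]], path[1:])
    return tuple(node)

def skip_tree_example(expr, rng=random):
    """Produce one (masked formula, missing subtree) training pair."""
    path, target = rng.choice(list(subtrees(expr))[1:])  # skip the root itself
    return mask_subtree(expr, path), target

formula = ("*", ("+", "a", "b"), "c")
masked, target = skip_tree_example(formula)
print("input: ", masked)   # e.g. ('*', '<PREDICT>', 'c')
print("target:", target)   # e.g. ('+', 'a', 'b')
```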

For additional examples and discussion of recent research in this area, see this Quanta Magazine article by Stephen Ornes.

What will the future hold?

So where is all this heading? With regard to computer mathematics, Timothy Gowers predicts that computers will be able to outperform human mathematicians by 2099. He says that this may lead to a brief golden age, when mathematicians still dominate in original thinking and computer programs focus on technical details, “but I think it will last a very short time,” given that there is no reason that computers cannot eventually become proficient at the more creative aspects of mathematical research as well.

Futurist Ray Kurzweil predicts that even earlier (roughly 2045), machine intelligence will first meet and then transcend human intelligence, leading to even more powerful technology in a dizzying cycle that we can only dimly imagine (a “singularity”). Like Gowers, Kurzweil does not see any reason that creative aspects of human thinking, such as mathematical reasoning, will be immune from these developments.

Not everyone is overjoyed with these prospects. Bill Joy, for one, is concerned that in Kurzweil’s singularity, humans could be relegated to minor players, if not ultimately extinguished. However, it must be acknowledged that even today, AI-like systems already handle many important decisions, ranging from finance and investment to weather prediction, using processes that humans only dimly understand.

One implication of all this is that mathematical training, both for mathematics majors and other students, must aggressively incorporate computer technology and teach computer methods for mathematical analysis and research, at all levels of study. Topics for prospective mathematicians should include a solid background in computer science (programming, data structures and algorithms), together with statistics, numerical analysis, machine learning and symbolic computing (or at least the usage of a symbolic computing tool such as Mathematica, Maple or Sage).

In a similar way, university departments of engineering, physics, chemistry, finance, medicine, law and social sciences need to significantly upgrade their training in computer skills — computer programming, machine learning, statistics and graphics. Large technology firms such as Amazon, Apple, Facebook, Google and Microsoft are already aggressively luring top mathematical, computer science and machine learning talent. Other employers, in other fields, will soon be seeking the same pool of candidates.

In short, one way or the other, intelligent computers are coming, and they are destined to transform fields ranging from mathematical research to law and medicine. Society in general must find a way to accommodate this technology, and to deal respectfully with the many people whose lives will be affected. But not all is gloom and doom. Mathematician Steven Strogatz envisions a mixed future:

Maybe eventually our lack of insight would no longer bother us. After all, AlphaInfinity could cure all our diseases, solve all our scientific problems and make all our other intellectual trains run on time. We did pretty well without much insight for the first 300,000 years or so of our existence as Homo sapiens. And we’ll have no shortage of memory: we will recall with pride the golden era of human insight, this glorious interlude, a few thousand years long, between our uncomprehending past and our incomprehensible future.
