Computer discovery of mathematical theorems
The present author recalls discussing the future of mathematics in 1983 with Paul Cohen, who in 1963 proved that the continuum hypothesis is independent of the axioms of Zermelo-Fraenkel set theory. Cohen was convinced that the future of mathematics, and much more, lies in artificial intelligence. Reuben Hersh recalls Cohen saying specifically that at some point in the future mathematicians would be replaced by computers. So how close are we to Cohen’s vision?
In fact, computer programs that discover new mathematical identities and theorems are already a staple of the field known as experimental mathematics. Here is just a handful of the many computer-based discoveries that could be mentioned (including some co-authored by the present author):
- A new formula for pi with the property that it permits one to rapidly calculate binary or hexadecimal digits of pi beginning at an arbitrary starting position, without needing to calculate any of the digits that came before (see the first sketch following this list).
- The surprising fact that if the Gregory series for pi is truncated to 10^n/2 terms for some positive integer n, the resulting decimal value is remarkably close to the true value of pi, except for periodic errors that are related to the “tangent numbers” (see the second sketch following this list).
- Evaluations of Euler sums (compound infinite series) in terms of simple mathematical expressions.
- Evaluations of lattice sums from the Poisson equation of mathematical physics in terms of roots of high-degree integer polynomials.
- A new result for Mordell’s cube sum problem.
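The first of these items refers to what is now known as the BBP formula: pi = sum_{k≥0} 16^(-k) [4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6)]. Here is a minimal Python sketch of the resulting digit-extraction technique; the function name and the use of ordinary double-precision arithmetic (which limits how many digits remain reliable at very large starting positions) are illustrative choices, not production code:

```python
# Minimal sketch: hexadecimal digits of pi from an arbitrary position,
# computed via the BBP formula. Illustrative only.

def bbp_pi_hex(d, num_digits=8):
    """Return num_digits hex digits of pi starting just after hex
    position d (d = 0 gives the digits right after the point)."""

    def frac_series(j):
        # Fractional part of sum_{k>=0} 16^(d-k) / (8k + j).
        s = 0.0
        for k in range(d + 1):
            # Three-argument pow computes 16^(d-k) mod (8k+j)
            # without ever forming the huge power.
            s = (s + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
        k = d + 1
        while True:  # a few rapidly vanishing tail terms
            term = 16.0 ** (d - k) / (8 * k + j)
            if term < 1e-17:
                break
            s = (s + term) % 1.0
            k += 1
        return s

    x = (4 * frac_series(1) - 2 * frac_series(4)
         - frac_series(5) - frac_series(6)) % 1.0
    digits = []
    for _ in range(num_digits):
        x *= 16
        digits.append("0123456789ABCDEF"[int(x)])
        x -= int(x)
    return "".join(digits)

print(bbp_pi_hex(0))        # 243F6A88 (pi = 3.243F6A88... in hex)
print(bbp_pi_hex(1000000))  # digits at the millionth hex position
```

The key point is that the modular exponentiation in the inner loop lets one jump directly to position d without computing any of the earlier digits.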
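The second item can be checked directly. Below is a quick sketch, assuming the third-party mpmath arbitrary-precision library is available; the choice n = 6, i.e. 500,000 terms, is just one convenient case (the loop takes a few seconds):

```python
# Sketch of the truncated Gregory series phenomenon; assumes the
# mpmath library for arbitrary-precision arithmetic.
from mpmath import mp, mpf

mp.dps = 45          # working precision in decimal digits
terms = 500000       # 10^6 / 2 terms of pi/4 = 1 - 1/3 + 1/5 - ...
s = mpf(0)
for k in range(terms):
    s += mpf((-1) ** k) / (2 * k + 1)
print(mp.pi)   # 3.14159265358979323846...
print(4 * s)   # agrees with pi except at isolated digit positions
```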
Computer proofs of mathematical theorems
In most of the above examples, the new mathematical facts in question were discovered by numerical exploration on a computer, and then later proven rigorously, the old-fashioned way, by mostly human efforts. So what about computers actually proving theorems?
Actually, this is also old hat at this point. Perhaps the best example is Thomas Hales’ 1998 proof of the Kepler conjecture, namely the assertion that the simple scheme of stacking spheres (think of oranges in a supermarket) achieves the highest possible average density of any possible arrangement, regular or irregular. Hales’ original proof met with some controversy, since it involved a computation documented by 250 pages of notes and three gigabytes of computer code and data. So Hales and his colleagues began entering the entire proof into computer proof-checking software (a combination of the Isabelle and HOL Light systems). In 2014 this process was completed and the proof was certified.
In November 2019, researchers at Google’s research center in Mountain View, California, published results for a new AI theorem-proving program. This program works with the HOL Light theorem prover and can prove, essentially unaided by humans, many basic theorems of mathematics. They have provided their tool in an open-source release, so that other mathematicians and computer scientists can experiment with it.
The Google AI system was trained on a set of 10,200 theorems that the researchers had gleaned from several sources, including many sub-theorems of Hales’ proof of the Kepler conjecture. Most of these theorems were in the areas of linear algebra, real analysis and complex analysis, but the Google researchers emphasize that their approach is very broadly applicable. In the initial release, their software was able to prove 5919 theorems, or 58% of the training set. When they applied their software to a set of 3217 new theorems that it had not previously seen, it succeeded in proving 1251, or 38.9%.
Mathematicians are already envisioning how this software can be used in day-to-day research. Jeremy Avigad of Carnegie Mellon University sees it this way:
You get the maximum of precision and correctness all really spelled out, but you don’t have to do the work of filling in the details. … Maybe offloading some things that we used to do by hand frees us up for looking for new concepts and asking new questions.
For additional details see the Google authors’ technical paper, and a New Scientist article by Leah Crane.
How close are we to fully automatic mathematical reasoning?
Present-day computerized theorem provers typically fall into two categories: automated theorem provers (ATPs), which use computationally intensive methods to search for a proof on their own, and interactive theorem provers (also known as proof assistants), which rely on an interplay with humans, verifying the correctness of an argument and checking proofs for logical errors. Researchers in the field, however, acknowledge that both types of software are still a far cry from a completely independent computer-based mathematical reasoning system.
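To give a flavor of the interactive style, here is a toy example in Lean, the proof assistant that figures in the verification effort described later in this article. The theorem and proof script are purely illustrative; the point is that the human supplies each step while the software checks it and reports any remaining goals:

```lean
-- Illustrative only: a toy interactive proof in Lean 4. The user
-- supplies each step; Lean verifies it and reports remaining goals.
theorem demo (m n : Nat) : m + n = n + m ∧ 0 + m = m := by
  constructor                -- split the conjunction into two goals
  · exact Nat.add_comm m n   -- goal 1: commutativity, a library lemma
  · exact Nat.zero_add m     -- goal 2: zero is a left identity
```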
For one thing, these tools have been given a cold shoulder by many present-day mathematicians, in part because considerable study and practice are required to use them proficiently, and in part because of a general distaste for the notion of automating mathematical thought.
But some mathematicians are embracing these tools. Kevin Buzzard of Imperial College London, for example, has begun to focus his research on computerized theorem provers. But he acknowledges, “Computers have done amazing calculations for us, but they have never solved a hard problem on their own. … Until they do, mathematicians aren’t going to be buying into this stuff.”
Emily Riehl, of Johns Hopkins University, teaches students to write mathematical proofs using a computerized tool. She reports that these tools help students to rigorously formulate their ideas. Even for her own research, she says that “using a proof assistant has changed the way I think about writing proofs.”
Vladimir Voevodsky of the Institute for Advanced Study in Princeton, after finding an error in one of his own published results, was an ardent advocate of using computers to check proofs, until his death in 2017. Timothy Gowers of Cambridge, who won the Fields Medal in 1998, goes even further, saying that major mathematical journals should prepare for the day when the authors of all submitted papers must first certify that their results have been verified by a computerized proof checker.
Josef Urban of the Czech Institute of Informatics, Robotics and Cybernetics believes that a combination of computerized theorem provers and machine learning tools is required to produce human-like mathematical research capabilities. In July 2020 his group reported on some new mathematical conjectures generated by a neural network that was trained on millions of theorems and proofs. The network proposed more than 50,000 new formulas, but, as they acknowledged, many of these were duplicates: “It seems that we are not yet capable of proving the more interesting conjectures.”
A research team led by Christian Szegedy of Google Research sees automated theorem proving as a subset of the field of natural language processing, and plans to capitalize on recent advances in that field to demonstrate solid mathematical reasoning. He and other researchers have proposed a “skip-tree” task scheme that exhibits surprisingly strong mathematical reasoning capabilities. Out of thousands of generated conjectures, about 13% were both provable and new, in the sense of not merely being duplicates of other theorems in the database.
Adam Zsolt Wagner of Tel Aviv University in Israel employed an AI approach to search for counterexamples to a set of long-standing conjectures in graph theory. These were rather modest computations: Wagner ran his program on his five-year-old laptop, and the runs required between two and 48 hours to discover the successful counterexamples.
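Wagner’s actual system uses a neural network trained with a reinforcement-learning technique (the cross-entropy method) to propose promising graph constructions. The underlying search idea, however, can be conveyed with a much simpler toy: the Python sketch below randomly generates small graphs and tests them against the (false) claim that the chromatic number of a graph always equals its clique number. Everything here, from the toy conjecture to the brute-force subroutines, is illustrative and is not drawn from Wagner’s paper:

```python
# Toy illustration of computerized counterexample search (not
# Wagner's method): random small graphs are tested against the false
# claim that chromatic number always equals clique number.
import itertools
import random

def clique_number(n, adj):
    # Size of the largest clique, by brute force (fine for n <= 8).
    best = 1
    for size in range(2, n + 1):
        for sub in itertools.combinations(range(n), size):
            if all(adj[u][v] for u, v in itertools.combinations(sub, 2)):
                best = size
    return best

def chromatic_number(n, adj):
    # Smallest k admitting a proper k-coloring, by brute force.
    for k in range(1, n + 1):
        for col in itertools.product(range(k), repeat=n):
            if all(col[u] != col[v]
                   for u in range(n) for v in range(u + 1, n) if adj[u][v]):
                return k
    return n

def random_graph(n, p=0.5):
    # Adjacency matrix of an Erdos-Renyi random graph G(n, p).
    adj = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            adj[u][v] = adj[v][u] = int(random.random() < p)
    return adj

random.seed(2021)
for trial in range(10000):
    n = random.randint(5, 7)
    adj = random_graph(n)
    if chromatic_number(n, adj) != clique_number(n, adj):
        print(f"counterexample after {trial + 1} trials, {n} vertices:")
        for row in adj:
            print(row)
        break
```

In Wagner’s setting, the uniform random proposals are replaced by a neural network whose parameters are gradually tuned toward constructions that come closer to violating the conjectured inequality.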
For additional examples and discussion of recent research in this area, see this Quanta Magazine article by Stephen Ornes.
Computer theorem prover verifies sophisticated new result
In most of the above applications of automated theorem provers and machine learning-based conjecture tools, the results in question were relatively unsophisticated, in the sense that they were rooted in fairly well-understood mathematical structures and techniques.
In a promising new development, a computerized proof assistant has verified a very difficult new result at the cutting edge of research mathematics.
The result in question is in the field dubbed “condensed mathematics,” wherein the traditional notion of a topological space (the basic setting of topology, the study of continuous transformations of objects, typically in fairly conventional geometric settings such as 3-D or n-D space) is replaced with a more general notion called a “condensed set,” a space that can be thought of informally as a very general assemblage of infinitely many points glued together.
The specific result is due to Peter Scholze of the University of Bonn, a mathematician with very impressive credentials, including the 2018 Fields Medal, arguably the highest honor in mathematics. In July 2019 he established that the field of functional analysis still works if one replaces conventional topological spaces with condensed sets, although he did not write the proof down in full until November 2019. At that point he contacted Kevin Buzzard (mentioned above), who referred him to Johan Commelin, a postdoctoral researcher at the University of Freiburg in Germany. Commelin then led a collaboration, ultimately involving more than a dozen mathematicians, to encode the proof in the Lean proof assistant.
The project proceeded in stages, with Scholze communicating portions of the proof to Commelin, who then posted them to a bulletin board of Lean users. When a mathematician on the bulletin board saw a portion that fit their particular area of expertise, they would volunteer to formalize it in Lean. The consortium finished its work on May 29, 2021, when a final run of Lean verified the proof in full detail.
For additional background and details, see this Quanta Magazine article by Kevin Hartnett.
What will the future hold?
So where is all this heading? With regard to computer mathematics, Timothy Gowers predicts that computers will be able to outperform human mathematicians by 2099. This may lead to a brief golden age, he says, when mathematicians still dominate in original thinking while computer programs focus on technical details, “but I think it will last a very short time,” given that there is no reason computers cannot eventually become proficient at the more creative aspects of mathematical research as well.
One implication of all this is that mathematical training, both for mathematics majors and other students, must aggressively incorporate computer technology and teach computer methods for mathematical analysis and research, at all levels of study. Topics for prospective mathematicians should include a solid background in computer science (programming, data structures and algorithms), together with statistics, numerical analysis, machine learning and symbolic computing (or at least the usage of a symbolic computing tool such as Mathematica, Maple or Sage).
In a similar way, university departments of engineering, physics, chemistry, finance, medicine, law and social sciences need to significantly upgrade their training in computer skills — computer programming, machine learning, statistics and graphics. Large technology firms such as Amazon, Apple, Facebook, Google and Microsoft are already aggressively luring top mathematical, computer science and machine learning talent. Other employers, in other fields, will soon be seeking the same pool of candidates.
In short, one way or the other intelligent computers are coming, and are destined to transform fields ranging from mathematical research to law and medicine. Society in general must find a way to accommodate this technology, and to deal respectfully with the many people whose lives will be affected. But not all is gloom and doom. Mathematician Steven Strogatz envisions a mixed future:
Maybe eventually our lack of insight would no longer bother us. After all, AlphaInfinity could cure all our diseases, solve all our scientific problems and make all our other intellectual trains run on time. We did pretty well without much insight for the first 300,000 years or so of our existence as Homo sapiens. And we’ll have no shortage of memory: we will recall with pride the golden era of human insight, this glorious interlude, a few thousand years long, between our uncomprehending past and our incomprehensible future.