Terence Tao’s vision of AI assistants in research mathematics

Credit: Valentin Tkach for Quanta Magazine

Computer-assisted tools for research mathematics

In a previous Math Scholar article, we highlighted some recent developments in sophisticated computer tools being applied to the enterprise of research mathematics. These tools include:

  1. Typesetting tools (usually LaTeX), combined with tools such as MathJax for embedding typeset mathematics into web pages.
  2. Collaboration tools such as blogs, FaceTime, Skype, Zoom, Slack and Microsoft Teams to facilitate communication and collaboration between researchers.
  3. Symbolic mathematical software such as Mathematica, Maple and Sage to perform increasingly powerful symbolic manipulations and derivations, and to generate graphics illustrating results.
  4. Custom-written code, often involving algorithms such as Ferguson’s PSLQ integer relation algorithm or the QQQ algorithm, together with very high-precision floating-point arithmetic, to discover new mathematical relations and identities.
  5. Formal mathematical software, such as Lean, Coq, HOL and Isabelle to encode and rigorously confirm all steps of a formal mathematical proof.

Terence Tao on proof checkers, AI and the future of mathematical research

The previous Math Scholar article also included excerpts from an interview between UCLA Fields Medalist mathematician Terence Tao and Christoph Drosser, a writer for Spektrum der Wissenschaft (the German edition of Scientific American). Drosser and Tao discussed at length the current status of computer tools in mathematical research, including proof-checking software and also AI-based tools, and what is likely to be the future course of the field. In the interview, Tao noted that the rise of automated proof checkers (his favorite is Lean) has fundamentally changed the trust equation between mathematical collaborators. Now much larger teams can mutually divide a proof task, with each collaborator’s work being fed into a proof checker to ensure validity.

Tao has now shared more thoughts on his vision of artificial intelligence and other advanced computer-based tools to assist mathematical researchers. In an Atlantic article dated 4 October 2024, edited with Atlantic writer Matteo Wong, Tao provided some more details of his vision. Here are a few excerpts from his interview:

Wong: What was your first experience with ChatGPT?

Tao: I played with it pretty much as soon as it came out. I posed some difficult math problems, and it gave pretty silly results. It was coherent English, it mentioned the right words, but there was very little depth. Anything really advanced, the early GPTs were not impressive at all. They were good for fun things—like if you wanted to explain some mathematical topic as a poem or as a story for kids. Those are quite impressive.

Wong: OpenAI says o1 [a new release focused on mathematics and science] can “reason,” but you compared the model to “a mediocre, but not completely incompetent” graduate student [clarified to mean “research assistant”].

Tao: Right, it’s the equivalent, in terms of serving as that kind of an assistant. But I do envision a future where you do research through a conversation with a chatbot. Say you have an idea, and the chatbot went with it and filled out all the details.

It’s already happening in some other areas. AI famously conquered chess years ago, but chess is still thriving today, because it’s now possible for a reasonably good chess player to speculate what moves are good in what situations, and they can use the chess engines to check 20 moves ahead. I can see this sort of thing happening in mathematics eventually: You have a project and ask, “What if I try this approach?” And instead of spending hours and hours actually trying to make it work, you guide a GPT to do it for you.

With o1, you can kind of do this. I gave it a problem I knew how to solve, and I tried to guide the model. First I gave it a hint, and it ignored the hint and did something else, which didn’t work. When I explained this, it apologized and said, “Okay, I’ll do it your way.” And then it carried out my instructions reasonably well, and then it got stuck again, and I had to correct it again. The model never figured out the most clever steps. It could do all the routine things, but it was very unimaginative.

One key difference between graduate students and AI is that graduate students learn. You tell an AI its approach doesn’t work, it apologizes, it will maybe temporarily correct its course, but sometimes it just snaps back to the thing it tried before. And if you start a new session with AI, you go back to square one. I’m much more patient with graduate students because I know that even if a graduate student completely fails to solve a task, they have potential to learn and self-correct.

[some discussion skipped]

Wong: You’ve also said previously that computer programs might transform mathematics and make it easier for humans to collaborate with one another. How so? And does generative AI have anything to contribute here?

Tao: Technically they aren’t classified as AI, but proof assistants are useful computer tools that check whether a mathematical argument is correct or not. They enable large-scale collaboration in mathematics. That’s a very recent advent.

Math can be very fragile: If one step in a proof is wrong, the whole argument can collapse. If you make a collaborative project with 100 people, you break your proof in 100 pieces and everybody contributes one. But if they don’t coordinate with one another, the pieces might not fit properly. Because of this, it’s very rare to see more than five people on a single project.

With proof assistants, you don’t need to trust the people you’re working with, because the program gives you this 100 percent guarantee. Then you can do factory production–type, industrial-scale mathematics, which doesn’t really exist right now. One person focuses on just proving certain types of results, like a modern supply chain.

The problem is these programs are very fussy. You have to write your argument in a specialized language—you can’t just write it in English. AI may be able to do some translation from human language to the programs. Translating one language to another is almost exactly what large language models are designed to do. The dream is that you just have a conversation with a chatbot explaining your proof, and the chatbot would convert it into a proof-system language as you go.

What does the future hold?

In short, the consensus of Tao and other leading-edge researchers in the area of computer-assisted mathematical research is that the technology is still in relative infancy. Much more work must be done before these tools can be mainstays of the research community in the same way, say, that Mathematica, Maple and Sage are for symbolic manipulations. And, frankly, the central idea and inspiration for a proof still must arise from a human mathematician. In other words, these computer- and AI-based tools are better thought of as “co-pilots.” As Terence Tao explained in the previous interview (see here for the full copy):

Tao: I think in three years AI will become useful for mathematicians. It will be a great co-pilot. You’re trying to prove a theorem, and there’s one step that you think is true, but you can’t quite see how it’s true. And you can say, “AI, can you do this stuff for me?” And it may say, “I think I can prove this.” I don’t think mathematics will become solved. If there was another major breakthrough in AI, it’s possible, but I would say that in three years you will see notable progress, and it will become more and more manageable to actually use AI. And even if AI can do the type of mathematics we do now, it means that we will just move to a higher type of mathematics. So right now, for example, we prove things one at a time. It’s like individual craftsmen making a wooden doll or something. You take one doll and you very carefully paint everything, and so forth, and then you take another one. The way we do mathematics hasn’t changed that much. But in every other type of discipline, we have mass production. And so with AI, we can start proving hundreds of theorems or thousands of theorems at a time. And human mathematicians will direct the AIs to do various things. So I think the way we do mathematics will change, but their time frame is maybe a little bit aggressive.

Comments are closed.