Posted Friday February 3 2023.

Proving mathematical theorems on computers is still apparently pretty hard. The most well-known successes are maybe the four color theorem, the Kepler conjecture, and the Liquid Tensor Experiment. One interesting metric is Freek Wiedijk’s list of 100 mathematical theorems and their formalization progress in various languages. Currently Isabelle and HOL Light are tied for first at 87, but my personal favorite is Lean with its extensive mathematical library, mathlib.

While *automated theorem proving* (ATP) refers to automatically finding the *proofs* of theorems, given their statements, *autoformalization* refers to converting natural language into formal mathematics. It’s no surprise that autoformalization didn’t get as much attention as ATP until recently, because without *language models* like GPTs, it’s basically impossible!

One can imagine an ideal AF system: input an old mathematics paper and out comes its formalization in some formal language of your choosing. Not only that, but once you’ve formalized all the existing papers, you get a whole formal database to search! Anyway, the main obstacle seems to be *lack of labeled data*, which would be natural language/formal statement pairs. There seems to be hope, though - even if you only encode the *statements* of theorems by hand (not their proofs), this would still be useful as you could retroactively rediscover many proofs, maybe using the natural language proof as a conditioning variable.

Sites like Math Stack Exchange (MSE) have a pretty strict question/answer format where every question needs to be well-defined to where it can be answered precisely. But this is precisely the form of the natural language theorem/proof pairs needed to train an AF system. Could MSE assist in this effort by allowing volunteers to transcribe questions and answers to their corresponding formalization?

First Experiments with Neural Translation of Informal to Formal Mathematics by Qingxiang Wang, Cezary Kaliszyk, and Josef Urban (2018).

Autoformalization with Large Language Models by Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, and Christian Szegedy (2022).

A Promising Path Towards Autoformalization and General Artificial Intelligence by Christian Szegedy (2020). (paywalled… wait a minute, didn’t this guy invent batch normalization…?)

Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar by Qingxiang Wang, Chad Brown, Cezary Kaliszyk, and Josef Urban (2019).

The work on univalent foundations seems relevant, but I can’t see how. Need to explore more.