My colleague Boris Shaliach has been working for years on computerized formalization of mathematics. After making a name for himself in the 1980’s in the, then hot, AI topic of automatic theorem proving, and getting tenure, he has since continued working on extending this automatic theorem proving to the whole of mathematics. His work is certainly no longer main stream AI, and he does not “play” the usual academic game: he hardly published anything since the late 80’s, and to this date he does not have a web page. I have to admit that most of us in the department did not believe that anything would come out of this research, politely calling it “high risk high yield”. However as he is an excellent teacher, was able to get a steady stream of European grants, advises a very large number of graduate students, and in general is friendly and interesting to talk to, everyone likes him.
For years he has been trying to convince everyone that mathematics will soon be done only by computers. His basic argument was that contrary to our intuitive feeling that mathematics requires the peak of human intelligence, and that the cleverness seen in mathematical proofs can never be replaced by a computer, the truth is exactly opposite. Mathematics is so difficult for humans precisely because of human limitations: evolution has simply not optimized humans for proving theorems, and this is exactly where computers should beat humans easily. His example was always how chess turned out to be easy for computers while natural language understanding or face recognition turned out to be much harder because humans seem to be “optimized” for these tasks. (I have to admit that he held this position even before computers beat the human world champion, pointing out that very early chess programs already beat him.) In this vein, I could not get him too excited about the recent amazing advances in computerized Go, and he was much more impressed by the success of Google’s language translation and by IBM’s Watson.
His baby has always been a formalized mathematics computer program, called URIM-VETUMIM, that defined a computer-readable language for specifying mathematical theorems and proofs. He was one of the original group behind the QED manifesto, and is one of not too many researchers still actively working on that goal of formalizing existing mathematics into a computerized readable form. It seems that there is a small industry of computer systems that allow one to formally describe mathematical axiomatic systems, theorems and proofs, and can then verify syntactically the correctness of a proof. His system, like many in the genre, uses the Zermelo-Fraenkel set theory as its axiomatic system and he has spent an enormous amount of effort trying to enter larger and larger pieces of mathematics formally into the system. In fact, for about 20 years he has provided jobs for many undergraduate students in the math department who were paid for taking standard math textbooks and formalizing them into his URIM-VETUMIM system. This turned out to be much harder than it seems where at the beginning, months would be required for formalizing one little lemma, but over time this became much more efficient, as his steady stream of CS M.Sc. students improved the human interface, adding various short cuts (glorified macros) and sometimes even a little intelligence in the form of proof assistants. One of the nice things about his URIM-VETUMIM system is that he never worried about students making errors: if it “compiled” into his system then it was correct.
I got involved in his efforts, helping him with the incentives engineering of replacing our math students with a “crowd-sourcing” alternative. His idea was to utilize Chinese math students who are very good and yet are willing to work for really low wages. He had some contacts in leading Math departments in China, and I helped him a bit as he set up a nice Mechanical-Turk-like system optimized for having Chinese math students enter theorems and proofs into the URIM-VETUMIM format.
I used to have long discussions with him about what the point of this formalization was: after all he is taking mathematical knowledge that we already have and translating it into a clearly less efficient computerized format — so what’s the gain? He had two points: first he claims that most proofs in published mathematical papers are buggy. Usually these are “small” bugs that can be fixed by any mathematician reading the paper, but often there are significant ones. He always pointed out the absurdity that important mathematical proofs (like for Fermat’s last theorem) required more than a year of “verification” by expert mathematicians — really of fixing various holes in them — before they are believed. His analogy was that writing a mathematical proof was similar to writing a computer program, but there was no “compiler” for mathematical theorems that allowed one to verify correctness. His second point was that that one first needs a large database of formalized existing mathematical knowledge before one may hope for new computer-invented proofs.
Unfortunately, the various groups attempting the formalization of mathematics have not agreed on a common language, so each group is working on its own pet language, and none of them seems to have reached critical mass so far. There seems to be some kind of informal competition between the groups about who can formalize more math into their system. Luckily most of the large systems have either made their database publicly available on the web or were willing to provide Boris with their data. Indeed in the last decade or so, many of his graduate students built systems that “translated” from the databases of his competitors into his own URIM-VETUMIM format.
Now comes the interesting stuff. Over the years many of his graduate students did work on various automatic theorem proving ideas, and kept writing more and more programs, each of them doing something better than the previous ones (and other things worse…) Over the years has has amassed literally dozens of programs that can automatically prove mathematical theorems, each one with some non-trivial yet very limited capabilities. Amazingly he was able to keep all the software of his former students functional and kept running every program on the updated mathematical URIM-VETUMIM database as it grew. His latest project was to combine all these provers in a “meta-algorithm” of the form that won the netflix challenge where the idea is to combine all the existing algorithms in his arsenal by running them all and choosing the “best” at each point. Of course the actual implementation is much more nuanced as involves using the large existing database to statistically determine which algorithms are best attempted in which cases, I suppose somewhat similarly to how SATzilla wins the SAT solving contests. However the combination of partial results from the various algorithms into single “proof” seems to be rather straight forward since they all just provide mathematical “lemmas” in the URIM-VETUMIM format. His meta-algorithms now constantly run in the background of his PC cluster, churning new input to the database as well as fine tuning the parameters of the meta-algorithms.
In the last year he has taken this a step forward and has designed a clear API for adding external theorem proving algorithms that are then automatically incorporated into his meta-algorithms. Building on his Mechanical-Turk-for Chinese-Math-Students setup, he is now crowd-sourcing sub-algorithms for his meta-theorem-proving-algorithm, this time from Chinese CS students. By now he has paid tens of thousands of dollars to hundreds of Chinese students who have contributed pieces of code. It seems that at this point it is very difficult to keep track of the contents of the URIM-VETUMIM system: both the mathematical knowledge data and the software have been build over the course of more than 20 years, with different pieces of it coming from various sources. This seems to be fine as Boris guards closely the core of the system: the URIM-VETUMIM language specification and the proof verifier.
Now comes the punchline: the system just spat out a “proof” of the Reimann hypothesis. The “proof” is a 200Mb long text file that is human-readable in the sense of understanding each line, but with no structure that a human can really understand. It is clear that the proven “statement” is that of the Reimman hypothesis, and the “proof” is verified by the software to be a formally correct proof of the theorem using Zermello-Fraenlkel set theory (with the axiom of choice, I think).