Google DeepMind have produced a program that in a certain sense has achieved a silver-medal performance at this year's International Mathematical Olympiad. 🧵
It did this by solving four of the six problems completely, which got it 28 points out of a possible total of 42. I'm not quite sure, but I think that put it ahead of all but around 60 competitors.
However, that statement needs a bit of qualifying.
The main qualification is that the program needed a lot longer than the human competitors -- for some of the problems over 60 hours -- and of course much faster processing speed than the poor old human brain.
If the human competitors had been allowed that sort of time per problem they would undoubtedly have scored higher.
Nevertheless, (i) this is well beyond what automatic theorem provers could do before, and (ii) these times are likely to come down as efficiency gains are made.
Another qualification is that the problems were manually translated into the proof assistant Lean, and only then did the program get to work. But the essential mathematics was done by the program: only the formalization of the problem statements was done by humans.
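To give a sense of what the program is actually handed, here is a toy example (not an actual IMO problem, and not taken from DeepMind's system) of a formally stated theorem in Lean 4: the humans supply a precise goal of this kind, and the program has to produce a proof that the Lean checker accepts.

    -- Toy illustration only: a trivially easy statement, far below IMO level.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b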
As with AlphaGo, the program learnt to do what it did by teaching itself. But for that it needed a big collection of problems to work on. They achieved that in an interesting way: they took a huge database of IMO-type problems and got a large language model to formalize them.
However, LLMs are not able to autoformalize reliably, so they got them to autoformalize each problem many times. Some of the formalizations were correct, but even the incorrect ones were useful as training data, since they were often easier problems.
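As a rough illustration of that data-generation idea (a hypothetical sketch, not DeepMind's actual pipeline), the loop looks something like the following; llm_formalize and lean_typechecks are invented stand-ins for "ask a language model for a candidate Lean statement" and "run the Lean checker on it".

    # Hypothetical sketch of the data-generation idea described above; not
    # DeepMind's actual pipeline. llm_formalize and lean_typechecks are invented
    # stand-ins for querying a language model and running the Lean checker.
    import random

    def llm_formalize(problem: str) -> str:
        # Stand-in: a real system would query a language model here.
        return f"-- candidate Lean statement for: {problem}"

    def lean_typechecks(statement: str) -> bool:
        # Stand-in: a real system would compile the candidate with Lean here.
        return random.random() < 0.5

    def generate_training_statements(problems: list[str], attempts: int = 8) -> list[str]:
        # Keep every candidate that at least type-checks. Candidates that fail to
        # capture the original problem faithfully are still well-posed formal
        # problems (often easier ones), so they remain usable as training data.
        kept = []
        for problem in problems:
            for _ in range(attempts):
                candidate = llm_formalize(problem)
                if lean_typechecks(candidate):
                    kept.append(candidate)
        return kept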
It's not clear what the implications of this are for mathematical research. Since the method used was very general, there would seem to be no obvious obstacle to adapting it to other mathematical domains, apart perhaps from insufficient data.
So we might be close to having a program that would enable mathematicians to get answers to a wide range of questions, provided those questions weren't *too* difficult -- the kind of thing one can do in a couple of hours.
That would be massively useful as a research tool, even if it wasn't itself capable of solving open problems.
Are we close to the point where mathematicians are redundant? It's hard to say. I would guess that we're still a breakthrough or two short of that.
It will be interesting to see how the time the program takes scales as the difficulty of the problems it solves increases. If it scales with a similar ratio to that of a human mathematician, then we might have to get worried.
But if the function human time taken --> computer time taken grows a lot faster than linearly, then more AI work will be needed.
The fact that the program takes as long as it does suggests that it hasn't "solved mathematics".
However, what it does is way beyond what a pure brute-force search would be capable of, so there is clearly something interesting going on when it operates. We'll all have to watch this space.
@PaulsonJonathan So no cheating took place there, but the method that the program used to work out what it needed to prove was quite a bit more brute-force-like than the method a human mathematician would typically use.
I'm very sad to hear that Daniel Dennett has died. I greatly enjoyed his books Consciousness Explained and Elbow Room, and I hope I won't annoy too many people if I express my opinion that what he said in those books was basically right. 1/
For instance, I agree with him that computers could in principle be conscious (but would be very cautious about making such a claim of an actual computer), and also that free will can be reasonably defined in a way that makes it entirely compatible with determinism. 2/
I briefly met him once, after lunch at Trinity where he had been a guest of Simon Blackburn. IIRC he had been giving a talk on why religion exists: he politely listened to my suggestion that confusing correlation with causation might have had a lot to do with it. 3/
Today I start my seventh decade, so here are a few reflections on what it's like to reach that milestone. 🧵
1. I've had an extremely fortunate life so far. Of course, nobody reaches the age of 60 without some bad things happening, but I've had a lot less bad than my fair share and a lot more good.
2. With each passing decade I get that much more aware of the finiteness of my life, but turning 60 is a big step up in that respect from turning 50. I have people basically my age talking about retirement, for example.
My son has just started calculus, and I asked him what the relationship was between the gradients of the tangent and the normal to a curve at a given point. His first reply was, "They are perpendicular." I've noticed many times that something one gains with experience ... 1/7
in mathematics is an acute sensitivity to types. An experienced mathematician could not give that answer, for the simple reason that gradients are real numbers and two real numbers cannot be perpendicular to each other. 2/7
It didn't take long for him to correct himself and give the answer I had been looking for, but the point remains: get someone into the habit of being aware of the type of everything they are talking about, and their mathematical thinking automatically becomes much clearer. 3/7
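For completeness, the answer being fished for, written out (assuming the tangent is neither horizontal nor vertical, so that both gradients are defined and non-zero):

    \[
      m_{\text{normal}} \;=\; -\frac{1}{m_{\text{tangent}}},
      \qquad\text{equivalently}\qquad
      m_{\text{tangent}}\, m_{\text{normal}} \;=\; -1 .
    \]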
I have often seen statistics like this, and am very much in favour of curbing the high-emitting activities of the rich (and while there are plenty of people richer than I am, I am not excluding myself from the people whose emissions must be curbed).
However, there is an important calculation that economists must have done somewhere, which I have not managed to find, concerning what the effects on emissions would be of a big redistribution of wealth. 2/
On the face of it, the argument looks simple: the rich are responsible for the lion's share of emissions, so if we redistributed in such a way as to bring the rich down to a lower level of wealth, we would have made big progress, by dealing with that lion's share. 3/
It's an amazing time to be a combinatorialist at the moment, with a number of long-standing problems, several of them personal favourites of mine, being resolved. Today I woke up to the news of yet another breakthrough, due to Sam Mattheus and Jacques Verstraete. 🧵
A month or two ago I tweeted about a stunning new result that obtained an exponential improvement for the upper bound for the Ramsey number R(k,k), a problem I had thought about a lot. When I felt stuck on that, I would sometimes turn my attention to a related problem
that felt as though it ought to be easier: estimating the Ramsey number R(4,k). This is the smallest n such that every graph on n vertices contains either a K_4 (that is, four vertices all joined to each other) or an independent set of size k (that is, k vertices no two of which are joined).
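In symbols, that definition reads:

    \[
      R(4,k) \;=\; \min\{\, n \in \mathbb{N} :
        \text{every graph on } n \text{ vertices contains a } K_4
        \text{ or an independent set of size } k \,\}.
    \]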
I was at a sensational combinatorics seminar in Cambridge yesterday, reminiscent of the time I had been tipped off that Andrew Wiles's seminar at the Newton Institute on Wednesday 23rd June 1993 might be worth going to. 🧵
The speaker was my colleague Julian Sahasrabudhe, who announced that he, Marcelo Campos, Simon Griffiths and Rob Morris had obtained an exponential improvement to the upper bound for Ramsey's theorem.
Ramsey's theorem says that for any number k there is a number R(k) with the property that if you have R(k) people in a room and any two of them are either friends or enemies, then you can find k people who are either all friends of each other or all enemies of each other.
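In the usual graph-theoretic language (people become vertices, friendship and enmity become the two edge colours), the same statement says:

    \[
      \forall k \;\exists\, R(k):
        \text{every red/blue colouring of the edges of } K_{R(k)}
        \text{ contains a monochromatic } K_k .
    \]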