Matthew Barnett
Mar 4 · 13 tweets · 3 min read
Something that surprised me last year was how well LLMs can do mathematics. I now suspect that mathematics is not much harder for computers to understand than ordinary natural-language documents. This has pretty interesting implications. 🧵
I was previously too anchored to statements that researchers made about how we weren't making progress.

For example: "While scaling Transformers is automatically solving most other text-based tasks, scaling is not currently solving MATH."
arxiv.org/abs/2103.03874
One thing I didn't appreciate was that simply allowing models to "think longer" lets them generate much better results. It seems obvious in hindsight, but I didn't realize the magnitude of the effect was so large.
arxiv.org/abs/2211.14275…
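To make the "think longer" effect concrete, here is a minimal, hypothetical sketch of one way to spend extra inference-time compute: sample many answers and take a plurality vote. This is the self-consistency idea, not the specific method of the linked paper; the stubbed model and its 60% accuracy figure are made up for illustration.

```python
import random
from collections import Counter

def sample_answer(rng: random.Random) -> int:
    """Stand-in for one sampled chain of thought from an LLM.

    Hypothetical toy model: it answers correctly (42) about 60% of
    the time and otherwise makes one of several random errors.
    """
    if rng.random() < 0.6:
        return 42
    return rng.choice([40, 41, 43, 44])

def majority_vote(n_samples: int, seed: int = 0) -> int:
    """Spend more inference compute: sample n answers, return the plurality."""
    rng = random.Random(seed)
    answers = [sample_answer(rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

# A single sample is right only ~60% of the time, but because the
# errors are scattered, the plurality over many samples is almost
# always the correct answer.
```

The point of the toy: even without making any single forward pass smarter, spending more compute at inference time sharply improves the final answer.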
I suspect most mathematics-inclined people are not fully aware of how dramatically LLMs could change how mathematics is performed in the coming decade.
Take the problem of proof verification. Right now, almost all mathematicians do their work informally. That is, they write their proofs using informal notation, rather than via a proof assistant.
en.wikipedia.org/wiki/Proof_ass…
This practice is somewhat problematic because informal mathematical proofs are hard to verify. Papers claiming to prove P≠NP are routinely submitted to journals, including by authors with reputable credentials, but formally checking these claims is burdensome.
However, LLMs should soon become far more adept at converting informal mathematics into machine-verifiable code. That means that it may soon become possible to quickly verify whether a controversial proof is valid, helping us filter cranks from geniuses.
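To make "machine-verifiable" concrete, here is a toy example in Lean 4 of the kind of artifact such a conversion would produce: the informal claim "the sum of two even numbers is even" restated as a theorem the proof assistant checks mechanically. The theorem name and phrasing are illustrative, not taken from the thread.

```lean
-- Informal claim: "the sum of two even numbers is even."
-- Formal version: if m = 2a and n = 2b, then m + n = 2(a + b).
theorem even_add_even (m n : Nat)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  obtain ⟨a, ha⟩ := hm
  obtain ⟨b, hb⟩ := hn
  exact ⟨a + b, by rw [ha, hb, Nat.mul_add]⟩
```

If any step were wrong, say the final rewrite, Lean would reject the proof outright rather than accept a plausible-sounding argument — which is what makes formalized proofs cheap to check.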
Unlike many other LLM use cases, mathematics is not especially prone to problems of model hallucination: an invalid, hallucinated proof can be efficiently flagged as invalid by a proof assistant.
arxiv.org/abs/2210.12283
In the longer term, LLMs will outcompete mathematicians outright. Predictors on Metaculus currently expect it will not be very long before we have AI that can get a perfect score on what may be the hardest math competition in the world.
metaculus.com/questions/1167…
When LLMs outperform top mathematicians at proof-generation, professional mathematics will likely become more like recreational mathematics. Humans may still contribute, but their contributions will rarely be seen as groundbreaking.
Some are worried about even more dramatic implications of mathematician AIs. For example, some people seem to think that when computers outperform human mathematicians, they'll be capable of rapid recursive self-improvement.
I'm not so sure about these claims. I don't currently think machine learning progress is severely bottlenecked by mathematical talent. I suspect progress is more bottlenecked by experimentation.
But I don't know. What would the world look like if the best mathematicians in the world were computer programs that we could summon by paying a few dollars for API access?
