Arnaud Dyevre

@ArnaudDyevre

Nov 11 • 31 tweets • 8 min read

https://twitter.com/calebwatney/status/1855016577646666123

I just read the paper in full; it is even more spectacular than I initially thought.
A short thread about the results and their significance.

This is the first ever paper to provide causal evidence of the impact of AI on scientific discovery, something that has been the subject of speculation by macroeconomists for years now.

Automating research & development is different from automating production tasks because the output of R&D (ideas) is infinitely re-usable and non-rival. If we automate idea discovery, we can massively accelerate long-term growth.

The author, a second-year (!) grad student at MIT, shows that scientists in a large US firm create new materials (polymers, biomaterials, alloys, etc.) at a higher rate AND of higher quality, novelty and commercial potential, when equipped with an AI tool.

The AI tool is a Graph Neural Network (not an LLM), trained on the structures of known materials. Scientists input a set of desired features (refraction, tensile strength, etc.), and the GNN generates candidate structures. These candidates then need to be tested IRL (at a cost).

This AI tool is very different from other AI technologies previously studied in the literature, such as in this great paper about how an LLM-powered chatbot improves the productivity of customer support agents (mostly the least productive ones).

In the present paper, the GNN enables scientists to discover 44% more new materials, file 39% more patents, and introduce 17% more product prototypes. All in the space of 7 to 24 months.
No data on commercialization though.

Importantly, the quality of new materials being tested and developed *increases*.
Quality here is the distance between the features targeted by researchers and the actual properties of the materials.
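To make the idea of "quality as a distance" concrete, here is a minimal sketch. The metric is an assumption for illustration only (a Euclidean distance over relative deviations); the thread does not say which distance the paper uses, and the function and property names are hypothetical.

```python
import math
from typing import Dict


def quality_gap(target: Dict[str, float], measured: Dict[str, float]) -> float:
    """Distance between targeted and measured properties (smaller = better).

    ASSUMPTION: Euclidean distance over relative deviations, so that
    properties on different scales are comparable. This is an illustrative
    choice, not necessarily the paper's actual metric.
    """
    total = 0.0
    for name, wanted in target.items():
        # Relative deviation of the measured value from the targeted value.
        relative_dev = (measured[name] - wanted) / wanted
        total += relative_dev ** 2
    return math.sqrt(total)


# Hypothetical target and lab measurement for one candidate material.
target = {"refraction": 1.5, "tensile_strength": 100.0}
measured = {"refraction": 1.5, "tensile_strength": 90.0}
print(quality_gap(target, measured))
```

A material that hits every target exactly gets a gap of zero, so "higher quality" in this sketch means a smaller gap.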

Novelty of the new materials also increases (novelty = how much the atomic structure of the new material needs to be altered to resemble an existing material, this is so cool).

Novelty of patents associated with the materials also increases, based on text (Kelly et al., 2021) or field (Kalyani, 2024) similarity.

R&D efficiency (new prototypes/costs) increases by 13-15%, even when including the costs of training the model.

Importantly, the *more productive* scientists--those initially producing more new materials--are gaining more from the collaboration with the AI. See the shift in new materials discovery distribution (left) and new prototype introduction by decile of productivity (right).

What explains this? More able scientists have a better capacity to evaluate the viability and commercial potential of the new materials suggested by the GNN. The paper ranks scientists by judgment ability based on the step of the idea discovery process they work in.

That is, they can work on (1) idea generation, (2) evaluating the materials or (3) testing the materials (AI intervenes only in 1).
The paper has detailed data on the work logs of the scientists and thus knows which steps they are working on, every day (such incredible data...)

Their judgement ability, which is estimated from the logs data, is strongly correlated with self-assessed judging ability reported in a survey of scientists.

And this is how tasks change when the AI is introduced, one of the coolest graphs of the whole paper: idea generation halves, judgement of materials doubles, experimentation slightly increases.

The paper constructs an empirical "discovery curve": the share of suggested materials that are viable as a function of the predicted viability. (Testing is costly, so scientists rank candidate materials by how likely they are to be viable; this ranking is a subjective judgement.)

With AI, the discovery curve shifts up; more candidates end up being viable overall. And it *flattens* a bit; scientists are not as good at identifying viable materials prior to testing, perhaps because the materials are too novel.

Note that lower-ability scientists are no better than chance at identifying viable compounds once the GNN is used; their discovery curve is flat. Very, very, very cool result.
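For intuition, a discovery curve of this kind could be computed roughly as follows. This is a toy sketch under my own assumptions (equal-sized bins over the scientist's ranking, made-up data, hypothetical function name), not the paper's actual estimation procedure.

```python
from typing import List, Tuple


def discovery_curve(
    candidates: List[Tuple[float, bool]], n_bins: int = 5
) -> List[float]:
    """Share of viable materials per bin of predicted viability.

    `candidates` holds (predicted_viability, turned_out_viable) pairs.
    Bins are ordered from the highest-ranked candidates to the lowest.
    ASSUMPTION: equal-sized bins over the ranking; illustrative only.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    n = len(ranked)
    shares = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins : (b + 1) * n // n_bins]
        if not chunk:
            shares.append(float("nan"))  # more bins than candidates
            continue
        viable = sum(1 for _, is_viable in chunk if is_viable)
        shares.append(viable / len(chunk))
    return shares


# Made-up data: a scientist whose ranking is informative.
good_judge = [(0.9, True), (0.8, True), (0.7, False),
              (0.6, True), (0.3, False), (0.2, False)]
print(discovery_curve(good_judge, n_bins=3))  # [1.0, 0.5, 0.0]

# Made-up data: a ranking that carries no information (flat curve).
coin_flip = [(0.9, True), (0.8, False), (0.7, True), (0.6, False)]
print(discovery_curve(coin_flip, n_bins=2))  # [0.5, 0.5]
```

A downward-sloping list means the scientist's ranking predicts viability; a flat list, as in the second example, is the "as good as chance" case.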

In the final months of the roll-out of the AI, the firm fired some workers. Those workers were disproportionately in the lowest quartile of judgment. The workforce expanded on net though, with hires outnumbering employees who were made redundant.

While scientists were more productive, they reported lower job satisfaction, mostly because their skills were under-utilized and because their tasks became more repetitive after the AI was introduced.

This result is a bit depressing, and it reminds me of a post I had seen here saying
"I don't want an AI that creates poetry and art while I do laundry and the dishes. I want AI that does the dishes and laundry while I create art and poetry"
(I don't remember who posted it sorry)

The paper is fantastic for several reasons:
- über-timely topic
- first causal estimate of AI on science
- fantastic data (clear measures of the quantity and quality of innovation (very rare), detailed task data, qualitative survey evidence)
- thorough testing of mechanisms

Also, it is so well-written and beautifully presented.

Huge congratulations to the author. This is an extraordinary piece of research.

While the paper is empirical, I believe its findings are also a triumph for economic theory.
First, the task-based approach to modeling the impact of AI on growth seems more apt than ever (see the Aghion, Jones & Jones paper mentioned in the earlier posts).

Second, the expansion of high-judgement ability scientists to judgement tasks and the replacement of scientists by the AI in the idea-generation phase is exactly what a task-based model of the labor market would predict.
economics.mit.edu/sites/default/…

In these models, an increase (even relative) in the productivity of workers expands the set of tasks they are performing, displacing others (see Proposition 2 in the paper linked above).

My own opinion, FWIW, is that a lot of the speculation about the economic impact of AI does not take seriously the heterogeneity of tasks that AI can automate. AI is not a homogeneous technology; it is a collection of tools that can displace both high-skill and low-skill workers.

So all the forecasts about the impact of AI on inequality seem premature to me. The technological landscape is just not stable enough to know which factors of production will be most affected.
(again, my own opinion, not that of the author of the paper)

This heterogeneity of applications can also explain why results differ so much from paper to paper. LLMs reduce inequality in call centers but maybe GNNs increase inequality in R&D labs.

h/t @AuthorJMac
