This is the first paper ever to provide causal evidence of the impact of AI on scientific discovery, something macroeconomists have speculated about for years.
Automating research & development is different from automating production tasks because the output of R&D (ideas) is infinitely re-usable and non-rival. If we automate idea discovery, we can massively accelerate long-term growth.
The author, a second-year (!) grad student at MIT, shows that scientists in a large US firm create new materials (polymers, biomaterials, alloys, etc.) at a higher rate AND with higher quality, novelty, and commercial potential when equipped with an AI tool.
The AI tool is a Graph Neural Network (not an LLM), trained on the structures of known materials. Scientists input a set of desired features (refraction, tensile strength, etc.), and the GNN generates candidate structures. These candidates then need to be tested IRL (at a cost).
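To give a feel for the workflow described here (generate candidates for a target property profile, rank them, send only the most promising ones to the costly lab step), here is a minimal toy sketch. It is not the paper's model: the GNN is replaced by a random candidate generator, and every name and number below is made up for illustration.

```python
# Toy sketch of the generate -> rank -> test workflow (NOT the paper's actual model).
# The GNN is stood in for by a random generator plus a noisy "property predictor".
import numpy as np

rng = np.random.default_rng(0)
n_candidates, n_features = 200, 4          # e.g. refraction, tensile strength, ...
target = np.array([0.8, 0.3, 0.5, 0.9])    # desired property profile (hypothetical)

# Stand-in for GNN output: candidate structures encoded as feature vectors
candidates = rng.random((n_candidates, n_features))

# Stand-in for predicted properties of each candidate (identity mapping + noise)
predicted_props = candidates + rng.normal(0, 0.05, candidates.shape)

# Rank candidates by predicted closeness to the target profile
scores = np.linalg.norm(predicted_props - target, axis=1)
shortlist = np.argsort(scores)[:10]        # only the top few get tested IRL, since tests are costly

print("candidates sent to the lab:", shortlist)
```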
This AI tool is very different from other AI technologies previously studied in the literature, such as in this great paper about how an LLM-powered chatbot improves the productivity of customer support agents (mostly the least productive ones).
In the present paper, the GNN enables scientists to discover 44% more new materials, file 39% more patents, and introduce 17% more product prototypes. All in the space of 7 to 24 months.
No data on commercialization though.
Importantly, the quality of new materials being tested and developed *increases*.
Quality here is measured by the distance between the features targeted by researchers and the actual properties of the materials: the closer the match, the higher the quality.
Novelty of the new materials also increases (novelty = how much the atomic structure of the new material would need to be altered to resemble an existing material; this is so cool).
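To make the quality and novelty measures concrete, here is a toy sketch of how such distance-based metrics could be computed. The vectors are invented and the paper's actual metrics and scaling almost certainly differ.

```python
# Toy illustration of distance-based quality and novelty measures (made-up vectors).
import numpy as np

target_features = np.array([0.8, 0.3, 0.5])     # what the researchers asked for
realized_props  = np.array([0.75, 0.35, 0.55])  # what the synthesized material actually does

# Quality: the closer the realized properties are to the target, the higher the quality
quality = -np.linalg.norm(realized_props - target_features)

# Novelty: distance from the new material's structural "fingerprint"
# to its nearest neighbour among known materials (larger = more novel)
known_materials = np.array([[0.1, 0.9, 0.4],
                            [0.6, 0.2, 0.8],
                            [0.3, 0.3, 0.3]])
new_structure = np.array([0.9, 0.1, 0.7])
novelty = np.min(np.linalg.norm(known_materials - new_structure, axis=1))

print(f"quality score: {quality:.3f}, novelty score: {novelty:.3f}")
```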
Novelty of the patents associated with the materials also increases, based on text (Kelly et al., 2021) or field (Kalyani, 2024) similarity.
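For intuition on the text-based measure, here is a rough sketch of one way to proxy patent novelty from text similarity. The actual Kelly et al. (2021) measure is richer (it weights backward vs. forward similarity over time), so treat this as the gist only; the patent texts are invented.

```python
# Rough sketch in the spirit of text-based novelty: a patent is "novel"
# when its text is dissimilar from prior patents. Texts below are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

prior_patents = [
    "polymer coating with improved tensile strength",
    "alloy composition for corrosion resistance",
]
new_patent = ["biomaterial scaffold with tunable refraction and tensile strength"]

tfidf = TfidfVectorizer().fit(prior_patents + new_patent)
sim_to_prior = cosine_similarity(tfidf.transform(new_patent),
                                 tfidf.transform(prior_patents))

novelty = 1 - sim_to_prior.max()   # low similarity to prior art = high novelty
print(f"text-based novelty: {novelty:.2f}")
```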
R&D efficiency (new prototypes/costs) increases by 13-15%, even when including the costs of training the model.
Importantly, the *more productive* scientists--those initially producing more new materials--gain more from collaborating with the AI. See the shift in the distribution of new-material discoveries (left) and new prototype introductions by decile of productivity (right).
What explains this? More able scientists have a better capacity to evaluate the viability and commercial potential of the new materials suggested by the GNN. The paper ranks scientists by judgment ability based on the step of the idea discovery process they work in.
That is, they can work on (1) idea generation, (2) evaluating the materials or (3) testing the materials (AI intervenes only in 1).
The paper has detailed data on the work logs of the scientists and thus knows which steps they are working on, every day (such incredible data...)
Their judgment ability, estimated from the log data, is strongly correlated with the self-assessed judgment ability reported in a survey of scientists.
And this is how tasks change when the AI is introduced, one of the coolest graphs of the whole paper: idea generation halves, judgment of materials doubles, and experimentation slightly increases.
The paper constructs an empirical "discovery curve": the share of suggested materials that turn out to be viable, as a function of their predicted viability (testing is costly, so scientists rank candidate materials by how likely they are to be viable; this is a subjective judgment).
With AI, the discovery curve shifts up; more candidates end up being viable overall. And it *flattens* a bit; scientists are not as good at identifying viable materials prior to testing, perhaps because the materials are too novel.
Note that once the GNN is used, lower-ability scientists do no better than chance at identifying viable compounds; their discovery curve is flat. Very, very, very cool result.
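Here is a sketch of how one might construct such a discovery curve. This is my own reading of the construction, run on simulated data built to mimic the qualitative pattern described above, not the paper's code.

```python
# Sketch of a "discovery curve": sort tested candidates by the scientist's predicted
# viability, then compute the realized share of viable materials within each
# predicted-viability decile. Data is simulated to mimic the pattern described above.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
predicted = rng.random(n)                                # subjective viability ranking
viable_pre_ai  = rng.random(n) < 0.2 + 0.5 * predicted   # prediction is informative
viable_post_ai = rng.random(n) < 0.45                    # higher base rate, prediction ~uninformative

def discovery_curve(predicted, viable, n_bins=10):
    order = np.argsort(predicted)
    bins = np.array_split(viable[order], n_bins)
    return np.array([b.mean() for b in bins])   # share viable, by predicted-viability decile

print("pre-AI curve :", discovery_curve(predicted, viable_pre_ai).round(2))
print("post-AI curve:", discovery_curve(predicted, viable_post_ai).round(2))
```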
In the final months of the roll-out of the AI, the firm fired some workers. Those workers were disproportionately in the lowest quartile of judgment. The workforce expanded on net though, with hires outnumbering employees who were made redundant.
While scientists were more productive, they reported lower job satisfaction, mostly because their skills were under-utilized and because their tasks were more repetitive after the AI.
This result is a bit depressing, and it reminds me of a post I had seen here saying
"I don't want an AI that creates poetry and art while I do laundry and the dishes. I want AI that does the dishes and laundry while I create art and poetry"
(I don't remember who posted it sorry)
The paper is fantastic for several reasons:
- über-timely topic
- first causal estimate of AI on science
- fantastic data (clear measures of the quantity and quality of innovation (very rare), detailed task data, qualitative survey evidence)
- thorough testing of mechanisms
Also, it is so well-written and beautifully presented.
Huge congratulations to the author. This is an extraordinary piece of research.
While the paper is empirical, I believe its findings are also a triumph for economic theory.
First, the task-based approach to modeling the impact of AI on growth seems more apt than ever (see the Aghion, Jones & Jones paper mentioned in the earlier posts).
Second, the expansion of high-judgment-ability scientists into judgment tasks and the replacement of scientists by the AI in the idea-generation phase is exactly what a task-based model of the labor market would predict. economics.mit.edu/sites/default/…
In these models, an increase (even a relative one) in the productivity of a group of workers expands the set of tasks they perform, displacing others (see Proposition 2 in the paper linked above).
My own opinion, FWIW, is that a lot of the speculation about the economic impact of AI does not take seriously the heterogeneity of tasks that AI can automate. AI is not a homogeneous technology; it is a collection of tools that can displace both high-skill and low-skill workers.
So all the forecasts about the impact of AI on inequality seem premature to me. The technological landscape is just not stable enough to know which factors of production will be most affected.
(again, my own opinion, not that of the author of the paper)
This heterogeneity of applications can also explain why results differ so much from paper to paper. LLMs reduce inequality in call centers but maybe GNNs increase inequality in R&D labs.
h/t @AuthorJMac
2022 has been a fantastic year for papers about growth/innovation and firm heterogeneity. Here is a short thread summarising the ones I enjoyed reading the most.
This is a fascinating paper documenting a new fact about the firm size distribution: its right tail becomes thicker as a country grows, i.e. superfirms become even more "super" as GDP per capita increases.
This was very surprising (at least to me) as most of the empirical evidence so far suggested that the firm size distribution was stationary.
Chen provides evidence against it from OECD countries, from countries for which the World Bank has firm data, and from the US time series.
A paper I am really proud of is coming out in print! It is joint work with the great @crescenzi_r (LSE) and @FrankNeffke (Harvard Growth Lab).
It asks: What is the impact of foreign investments by firms on innovation in cities? What is the role of firm heterogeneity in this? Short🧵
We start from the observation that innovative cities (as measured by patent counts) are a *very* exclusive club
Inequality in patent production is much higher than income inequality across regions (Lorenz curves below)
10 most innovative regions = 45% of patent prod. in 2012 (!)
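For readers unfamiliar with the comparison, here is a minimal sketch of how Lorenz curves for regional patents vs. income could be computed; the regional numbers are invented for illustration.

```python
# Minimal Lorenz-curve sketch (regional numbers are invented): sort regions from
# least to most, then take cumulative shares; the more the curve sags below the
# 45-degree line, the more concentrated the variable.
import numpy as np

def lorenz(x):
    x = np.sort(np.asarray(x, dtype=float))
    cum = np.cumsum(x) / x.sum()
    return np.insert(cum, 0, 0)   # cumulative share, from lowest region to highest

patents = [1, 2, 3, 5, 8, 15, 40, 90, 200, 500]          # highly concentrated
income  = [40, 55, 60, 70, 80, 90, 100, 110, 130, 160]   # much more even

print("patents:", lorenz(patents).round(2))
print("income :", lorenz(income).round(2))
```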
Not only are very few regions part of the club, but the club of innovative regions has also barely changed over the past 40 years.
If you plot total patents produced in recent years v. past production across regions, the correlation is striking (few exceptions in yellow)
With friends at LSE, I recently had a reading group session in memory of Emmanuel Farhi. I discussed some of his influential papers on production networks, written with David Baqaee.
I thought I'd summarise this discussion here for people interested in this beautiful line of work 👇
Out of the many papers Baqaee & Farhi wrote, 3 are especially inspirational. They are in yellow below:
Paper 1 lays the foundation of their theory of input-output networks.
Paper 2 applies it to inefficient economies (= with markups).
Paper 3 shows that the structure of an economy's production network matters a great deal for the amplification of shocks.
Even shocks to only one firm or sector.
And even in a perfectly competitive economy.