Director and CEO at FutureHouse. Building an AI scientist. https://t.co/rQYoPOxsYo
Sep 11, 2024 • 6 tweets • 3 min read
Introducing PaperQA2, the first AI agent that conducts entire scientific literature reviews on its own.
PaperQA2 is also the first agent to beat PhD and postdoc-level biology researchers on multiple literature research tasks, as measured both by accuracy on objective benchmarks and by assessments from human experts. We are publishing a paper and open-sourcing the code.
This is the first example of AI agents exceeding human performance on a major portion of scientific research, and will be a game-changer for the way humans interact with the scientific literature.
Paper and code are below, and congratulations in particular to @m_skarlinski, @SamCox822, @jonmlaurent, James Braza, @MichaelaThinks, @mjhammerling, @493Raghava, @andrewwhite01, and others who pulled this off. 1/
PaperQA2 finds and summarizes relevant literature, refines its search parameters based on what it finds, and provides cited, factually grounded answers that are more accurate on average than answers provided by PhD and postdoc-level biologists. When applied to highly specific questions, it obtains SOTA performance on LitQA2, the part of LAB-Bench focused on information retrieval. 2/
Oct 12, 2021 • 13 tweets • 4 min read
It’s amazing to see the FRO concept get off the ground. Congratulations to @AdamMarblestone, @AGamick, @SchmidtFutures, and everyone else involved!! Everyone needs to be paying attention to this. For people who aren’t familiar with FROs, I’ll provide some background here.
Academia is great at creating new technologies but not at scaling them up. FROs are a new non-profit science funding structure proposed first in my thesis (dspace.mit.edu/handle/1721.1/…) and then in more detail together with Adam in the @Day1Project paper (dayoneproject.org/post/focused-r…).