Andrew White 🐦‍⬛
May 8 (8 tweets)
ChemCrow is out today in @NatMachIntell! ChemCrow is an agent that uses chem tools and a cloud-based robotic lab for open-ended chem tasks. It’s been a journey to get to publication and I’d like to share some history about it. It started back in 2022. 1/8
I was working as a red teamer for GPT-4 and kept getting hallucinated molecules when trying to get up to trouble in chemistry. Then I tried the ReAct agent (from @ShunyuYao12) and quickly saw real molecules. This work was eventually made public in the GPT-4 technical report. 2/8
The problem with LLM agents in science is that they must be judged in the lab. So I called @pschwllr – one of the best chemists I know, and the inventor of molecular transformers. We teamed up and worked together on a plan to improve and test the agent. 3/8
We then brought on the extremely talented @drecmb and @SamCox822 – the co-first authors who developed many of the tools, evaluation ideas, and guardrails to ensure safety, and did the majority of the difficult work. 4/8
We knew that the exciting next step was a cloud lab to automatically execute and test the molecular designs. We teamed up with @OSchilter and @CarloBalda97 and got to experimental validation in a cloud lab, including having ChemCrow design a novel dye. 5/8
Near the end of the ChemCrow project, I joined with @SGRodriques to found @FutureHouseSF around scientific agents and automated laboratories. @SamCox822 joined shortly after and we followed up with WikiCrow – an agent that does scientific literature research. 6/8
So what’s up with the crow? Crows can talk – like a parrot – but their intelligence lies in tool use. We're continuing the journey at FutureHouse on building scientific crows and can't wait to share more :) 7/8

More from @andrewwhite01

Jun 6, 2023
How can you learn to predict peptide properties without negative examples? This happens often when trying to analyze outputs from screening results. We explore various approaches in this new paper from @MehradAnsari. 1/4

biorxiv.org/content/10.110…
Peptide screening usually gives only positive examples, which makes it difficult to train a classifier. Previous work has been done on this, including one-class SVM. We evaluate these and propose a modified algorithm built on "spies" 2/4
A spy is a positive example, which you relabel as a negative to help identify a decision boundary. Using a basic convolutional NN, we evaluate our method across multiple tasks where we happen to have negative and positive data. 3/4
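
To make the spy idea concrete, here is a minimal sketch of the classic spy step from positive-unlabeled learning. It is not the modified algorithm from the paper. The function name `spy_reliable_negatives` is hypothetical, the data is assumed to be one-dimensional, and a simple centroid-based score stands in for the convolutional NN mentioned above.

```python
import random


def spy_reliable_negatives(positives, unlabeled, spy_frac=0.2, seed=0):
    """Return unlabeled points that score below every spy (hypothetical helper)."""
    if len(positives) < 2:
        raise ValueError("need at least two positives to hold out a spy")
    rng = random.Random(seed)
    shuffled = list(positives)
    rng.shuffle(shuffled)

    # Hold out some positives as spies and relabel them as negatives.
    n_spies = min(len(shuffled) - 1, max(1, int(len(shuffled) * spy_frac)))
    spies, kept = shuffled[:n_spies], shuffled[n_spies:]
    negative_pool = list(unlabeled) + spies

    # Stand-in "classifier": compare distance to each class centroid.
    pos_mean = sum(kept) / len(kept)
    neg_mean = sum(negative_pool) / len(negative_pool)

    def score(x):
        # Higher means more positive-like.
        return abs(x - neg_mean) - abs(x - pos_mean)

    # The spies are known positives, so the lowest spy score marks
    # where positives hidden in the unlabeled set can still appear.
    threshold = min(score(s) for s in spies)
    return [x for x in unlabeled if score(x) < threshold]


positives = [4.8, 4.9, 5.0, 5.1, 5.2, 5.0, 4.95, 5.05, 5.15, 4.85]
unlabeled = [0.0, 0.1, -0.1, 0.2, 5.0, 4.9, 5.1, -0.2]
reliable = spy_reliable_negatives(positives, unlabeled)
```

The points returned in `reliable` can then be treated as negatives when training an ordinary binary classifier. Taking the minimum spy score is the strictest choice; a low percentile of spy scores is a common, more noise-tolerant alternative.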
Apr 12, 2023
How can you check if a molecule is present in a >10B dataset in 0.2 ms? With bloom filters! Check out our preprint on bloom filters by @4everstudent95 1/4

Code: github.com/whitead/molblo…
Paper: arxiv.org/abs/2304.05386
Bloom filters are fast and can store ultra-large chemical libraries in RAM, at the cost of a false positive rate of 0.005 (can tune this!) 2/4
In the paper, we compared fingerprints vs SMILES and found that bloom filters on SMILES are faster and have better performance. In fact, SMILES follows theoretical performance exactly 3/4
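
For readers new to the data structure, here is a minimal sketch of a Bloom filter over SMILES strings. It is not the linked molbloom code. The `BloomFilter` class, its parameter names, and the SHA-256 double-hashing scheme are illustrative assumptions. The sizing uses the standard formulas for a target false positive rate, with 0.005 taken from the thread as the default.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter over strings such as SMILES (illustrative only)."""

    def __init__(self, capacity: int, fp_rate: float = 0.005):
        ln2 = math.log(2)
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(1, math.ceil(-capacity * math.log(fp_rate) / ln2**2))
        self.num_hashes = max(1, round(self.num_bits / capacity * ln2))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True can be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


library = BloomFilter(capacity=1000)
for smiles in ["CCO", "c1ccccc1", "CC(=O)O"]:
    library.add(smiles)
```

A lookup such as `"CCO" in library` only hashes the string and reads a few bits, so its cost does not grow with the number of stored molecules. Exact string matching also means the same molecule must always be written as the same SMILES string (for example, canonicalized) before insertion and lookup.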
Apr 12, 2023
Our preprint on using GPT-4 as an agent with tools for chemistry is out! We call it ChemCrow. Working with @SamCox822, @drecmb, and @pschwllr, we developed a set of tools for synthesis/cond, safety, commercial availability, patents, paper-qa

arxiv.org/abs/2304.05376 1/5
We, unsurprisingly, found that GPT-4 with tools is much better than GPT-4 alone. Here it outlines a synthesis for atorvastatin complete with steps, an ingredient list, cost, and suppliers. We implement this with @LangChainAI (great library!)
One of the biggest surprises for us was that GPT-4 has trouble evaluating the completions! Comparing the two answers, GPT-4 as an evaluator ranks ChemCrow to be about the same, even though GPT-4 alone fails often. 3/5
Apr 3, 2023
I've been exploring if GPT-4 and other models (please give me a key @AnthropicAI!!) can do "algebra" of molecules. Let's see a few examples 1/4

demo: whitead.github.io/svelte-chem-al…
First - "mutate." Basically create similar molecules from the given molecule. This is interesting in modifying compounds for design or XAI - building out local chemical spaces. 2/4
Second - "add." Combine two molecules. This is interesting for joining fragments in drug discovery, or just for modifying a scaffold. GPT-4 does pretty well here! 3/4
Mar 16, 2023
Can GPT-4 do drug discovery? No, but it can help. Let's walk through GPT-4 proposing new drugs. This is called knowledge-based screening. We're trying to fill a list of plausible compounds that could lead to new drugs based on research papers. 1/n
This is one small step in drug discovery. There are many others! The compounds GPT-4 proposes have to be made and tested, and then they just start a path towards a new drug. Let's do a new example for psoriasis by targeting a known protein TYK2. Here is the prompt. 2/n
I made tools for GPT-4 to use - it will hallucinate when working with molecules directly. I instruct it to rely on these tools. First it does literature searches using one of these tools on the target. 3/n
Feb 10, 2023
My research group's @LangChainAI hackathon projects🙌 Great job to all of them and I hope anyone reading this gets a glimpse into the future of chemistry. These were done in 1 week 1/5
The first from @GWellawatte - input is a protein structure PDB ID and question about the protein and the output is cited answers about it. Works by downloading papers from PDB affiliated with ID. 2/5
Second project from @MehradAnsari - searches arxiv from a question, downloads papers, and uses paper-qa to answer the questions. 3/5