Andrew White 🐦‍⬛
Head of Sci/cofounder @FutureHouseSF. Prof of chem eng @UofR (on sabbatical). Automating science with AI and robots in biology. Corvid enthusiast
May 8 8 tweets 4 min read
ChemCrow is out today in @NatMachIntell! ChemCrow is an agent that uses chem tools and a cloud-based robotic lab for open-ended chem tasks. It’s been a journey to get to publication and I’d like to share some history about it. It started back in 2022. 1/8

I was working as a red teamer for GPT-4 and kept getting hallucinated molecules when trying to get up to trouble in chemistry. Then I tried the ReAct agent (from @ShunyuYao12) and quickly saw real molecules. This work was eventually made public in the GPT-4 technical report. 2/8
Jun 6, 2023 4 tweets 2 min read
How can you learn to predict peptide properties without negative examples? This happens often when trying to analyze outputs from screening results. We explore various approaches in this new paper from @MehradAnsari. 1/4

biorxiv.org/content/10.110…

Peptide screening usually gives only positive examples, which makes it difficult to train a classifier. There is previous work on this, including one-class SVMs. We evaluate these approaches and propose a modified algorithm built on "spies" 2/4
Apr 12, 2023 4 tweets 2 min read
How can you check if a molecule is present in a >10B dataset in 0.2 ms? With bloom filters! Check out our preprint on bloom filters by @4everstudent95 1/4

Code: github.com/whitead/molblo…
Paper: arxiv.org/abs/2304.05386

Bloom filters are fast and can store ultra-large chemical libraries in RAM, at the cost of a false positive rate of 0.005 (can tune this!) 2/4
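To make the speed/memory versus false-positive trade-off concrete, here is a minimal Bloom filter sketch in Python. It is not the molbloom implementation: the `BloomFilter` class, its parameter names, and the SHA-256 double-hashing scheme are assumptions chosen for illustration. It uses the standard sizing formulas to pick the number of bits and hash functions for a target false positive rate.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter for strings such as SMILES (hypothetical class)."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.005):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        m = -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
        self.num_bits = max(8, math.ceil(m))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position associated with the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Toy usage with a three-molecule "library".
library = BloomFilter(expected_items=1000, false_positive_rate=0.005)
for smiles in ["CCO", "c1ccccc1", "CC(=O)O"]:
    library.add(smiles)

print("CCO" in library)  # True: an added item is never reported as missing
```

A lookup only touches a handful of bits, which is why membership checks stay fast even for very large libraries. Lowering `false_positive_rate` makes the bit array larger and adds hash functions. In practice the strings would need to be canonicalized consistently before adding and querying, which this sketch does not do.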
Apr 12, 2023 6 tweets 3 min read
Our preprint on using GPT-4 as an agent with tools for chemistry is out! We call it ChemCrow. Working with @SamCox822, @drecmb, and @pschwllr, we developed a set of tools for synthesis/cond, safety, commercial availability, patents, paper-qa

arxiv.org/abs/2304.05376 1/5

We, unsurprisingly, found that GPT-4 with tools is much better than GPT-4 alone. Here it outlines a synthesis for atorvastatin complete with steps, an ingredient list, cost, and suppliers. We implement this with @LangChainAI (great library!)
Apr 3, 2023 4 tweets 2 min read
I've been exploring if GPT-4 and other models (please give me a key @AnthropicAI!!) can do "algebra" of molecules. Let's see a few examples 1/4

demo: whitead.github.io/svelte-chem-al…

First - "mutate." Basically create similar molecules from the given molecule. This is interesting in modifying compounds for design or XAI - building out local chemical spaces. 2/4
Mar 16, 2023 10 tweets 4 min read
Can GPT-4 do drug discovery? No, but it can help. Let's walk through GPT-4 proposing new drugs. This is called knowledge-based screening. We're trying to fill a list of plausible compounds that could lead to new drugs based on research papers. 1/n

This is one small step in drug discovery. There are many others! The compounds GPT-4 proposes have to be made and tested, and then they just start a path towards a new drug. Let's do a new example for psoriasis by targeting a known protein TYK2. Here is the prompt. 2/n
Feb 10, 2023 6 tweets 3 min read
My research group's @LangChainAI hackathon projects🙌 Great job to all of them and I hope anyone reading this gets a glimpse into the future of chemistry. These were done in 1 week 1/5

The first is from @GWellawatte - the input is a protein structure PDB ID and a question about the protein, and the output is cited answers about it. Works by downloading papers affiliated with the ID from PDB. 2/5
Feb 10, 2023 5 tweets 2 min read
OK, kids are in bed. Time to learn @Gradio. Wish me luck!

@Gradio Some notes: wish there was a default to have Python error messages show up somewhere (maybe a standard component)
Feb 2, 2023 6 tweets 2 min read
I just paid $60 to embed the text of the entire Lord of the Rings trilogy so I could have GPT answer a question I've wondered my whole evening: Do the people of Middle Earth poop? 1/6

I did this using @gpt_index and @LangChainAI to bring up all the relevant passages from the book and combine them into a chain of prompts answered by GPT-3.5. There were not many relevant passages to work with, so the model had some trouble. 2/6
Aug 7, 2022 6 tweets 3 min read
New preprint on pre-trained models for Bayesian optimization (BO) of sequences! We show LLMs trained on protein seqs can replace Gaussian processes in BO. Examples: BO of peptide inhibitors with AlphaFold and iterative design of proteins. 1/6
biorxiv.org/content/10.110…

We wanted to combine few-shot capabilities of pre-trained models with BO. We found deep ensembles can give LLMs uncertainty and the reparameterization trick enables gradients on sequences. This enables explore/exploit of BO with accuracy of LLMs. 2/6
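As a rough illustration of how an ensemble can supply the uncertainty that BO needs, the sketch below scores candidate sequences with an upper confidence bound (UCB) computed from the spread of ensemble predictions. This is not the method from the preprint: the function names, the toy "models", and the choice of UCB over a fixed candidate list (rather than gradients via the reparameterization trick) are all assumptions made to keep the example small.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

# A "model" here is any function mapping a sequence to a predicted property value.
Model = Callable[[str], float]


def ensemble_ucb(models: Sequence[Model], sequence: str, beta: float = 1.0) -> float:
    """Upper confidence bound: ensemble mean plus beta times ensemble spread."""
    preds = [model(sequence) for model in models]
    # Disagreement between ensemble members stands in for predictive uncertainty.
    return mean(preds) + beta * pstdev(preds)


def pick_next(models: Sequence[Model], candidates: Sequence[str], beta: float = 1.0) -> str:
    """Choose the candidate with the highest UCB score to evaluate next."""
    return max(candidates, key=lambda seq: ensemble_ucb(models, seq, beta))


# Hypothetical stand-ins for independently trained ensemble members.
toy_models = [
    lambda s: 1.0 * s.count("K"),
    lambda s: 1.2 * s.count("K"),
    lambda s: 0.8 * s.count("K") + 0.1 * len(s),
]

candidates = ["AAAA", "AKAK", "KKKK"]
print(pick_next(toy_models, candidates, beta=1.0))
```

Setting `beta` to zero gives pure exploitation (pick the best predicted mean), while larger values favor sequences where the ensemble members disagree.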
Apr 13, 2022 10 tweets 3 min read
I've put together a few of my favorite discussions on the details of doing molecular dynamics. I'll add more as they come. Hopefully they're useful to you! 🧵1/n

2/n A discussion about assessing uncertainty in metadynamics
Aug 31, 2021 7 tweets 5 min read
1/6 For the last few months @glenhocky and I have been asking what large language models (LLMs) can do for chemistry. In our new preprint, we show LLMs know a bit of chemistry and can do a lot: like compute the dissociation curve of H2.
arxiv.org/abs/2108.13360

2/6 LLMs that can generate code have reached accuracy that makes them usable in research. In their training, they picked up knowledge of chemistry. If you ask @OpenAI's Codex to draw caffeine it knows both how to draw a molecule and the structure of caffeine.