Philippe Schwaller (he/him)
Tenure-Track Assistant Professor of Digital Chemistry at @EPFL with @SchwallerGroup | @NCCR_Catalysis | prev @IBMResearch @forRXN | ML/AI-accelerated Chemistry

Dec 28, 2019, 9 tweets

Looking for a weekend/holiday read?
Happy to share this major update of our #NeurIPS2019 #ML4PS workshop paper on chemical reaction classification (but not only... 🧪⚗️🌍). @IBMResearch @unibern #compchem #RealTimeChem

Summary thread ⬇️:

We compared different RXN classification methods. 📍Using a BERT model borrowed from NLP, we matched the ground truth (Pistachio, @nmsoftware) with an accuracy of 98.2%.

We not only visualized what was important for the class predictions by looking at the different attention weights...

... but also mapped the chemical reaction space using the embeddings of our RXN BERT classifier (RXN fingerprint):

We investigated different combinations of RXN fingerprints: a) unsupervised, b) hand-crafted, c) supervised, d) a + b merged, and e) b + c merged.
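
The thread does not say how the fingerprints were merged. One simple possibility, assumed here purely for illustration, is to L2-normalize each fingerprint and concatenate the two vectors. The function names and toy vectors below are hypothetical and are not taken from the manuscript.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def merge_fingerprints(fp_a, fp_b):
    """Assumed merge strategy: normalize each fingerprint, then concatenate.

    Normalizing first keeps one fingerprint from dominating distance
    computations just because its raw values are larger.
    """
    return l2_normalize(fp_a) + l2_normalize(fp_b)


# Toy stand-ins for an unsupervised and a hand-crafted fingerprint.
unsupervised_fp = [3.0, 4.0]
handcrafted_fp = [0.0, 2.0, 0.0]

merged = merge_fingerprints(unsupervised_fp, handcrafted_fp)
print(len(merged))
```

The merged vector has as many dimensions as both inputs combined, so it can feed the same downstream tools (classifier, map, nearest neighbor search) as a single fingerprint.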

And showed that our BERT RXN fingerprint can be used for efficient nearest neighbor searches in the reaction space without knowing the reaction center or distinguishing between reactants and reagents. Examples:

The similar precursors, products and reaction centers can be recognized even by non-experts.
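
A nearest neighbor search over reaction fingerprints can be sketched as below. This is a brute-force toy version that assumes cosine similarity as the metric, which the thread does not specify. The reaction IDs, vectors, and function names are hypothetical, and a real search over millions of reactions would likely use an approximate index instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_fp, indexed_fps, k=2):
    """Return the k most similar (reaction_id, similarity) pairs, best first."""
    scored = [
        (cosine_similarity(query_fp, fp), rxn_id)
        for rxn_id, fp in indexed_fps.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(rxn_id, score) for score, rxn_id in scored[:k]]


# Toy index: in practice each vector would be the embedding of one reaction.
toy_index = {
    "rxn_A": [1.0, 0.0, 0.0],
    "rxn_B": [0.9, 0.1, 0.0],
    "rxn_C": [0.0, 1.0, 0.0],
}

hits = nearest_neighbors([2.0, 0.0, 0.0], toy_index, k=2)
for rxn_id, score in hits:
    print(rxn_id, round(score, 3))
```

Because the search only compares vectors, it needs no annotation of the reaction center and no reactant/reagent split, which is the point made in the thread.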

For more examples and info read our manuscript. Made with love by @pschwllr, @skepteis, @acvaucher, Vishnu, @teodorolaino and @jrjrjlr.

Throughout this project we used (among others) the following tools and libraries: @PyTorch, @TensorFlow, @huggingface transformers, OpenNMT, @RDKit_org, @the_cdk and TMAP. Many thanks!
