(1/5) The latest from our group: Jenna Fromer's overview of computer-aided multi-objective optimization in small molecule discovery is now online & open access @Patterns_CP | doi.org/10.1016/j.patt… #compchem
(2/5) Our focus here is on Pareto optimization. Pareto optimization introduces additional algorithmic complexities, but reveals more information about the trade-offs between objectives and is more robust than scalarization approaches
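To make the contrast with scalarization concrete, here is a minimal sketch (not taken from the review) that filters a toy candidate set down to its non-dominated points and compares that with a weighted-sum pick. The function names, the toy scores, and the convention that both objectives are maximized are all assumptions made for illustration.

```python
import numpy as np

def pareto_mask(scores):
    """Return a boolean mask of non-dominated rows (all objectives maximized)."""
    scores = np.asarray(scores, dtype=float)
    mask = np.ones(scores.shape[0], dtype=bool)
    for i in range(scores.shape[0]):
        # Row j dominates row i if it is >= on every objective and > on at least one.
        dominates_i = np.all(scores >= scores[i], axis=1) & np.any(scores > scores[i], axis=1)
        mask[i] = not dominates_i.any()
    return mask

def best_by_weighted_sum(scores, weights):
    """Scalarization: index of the candidate with the highest weighted sum."""
    return int(np.argmax(np.asarray(scores, dtype=float) @ np.asarray(weights, dtype=float)))

# Hypothetical candidates scored on two objectives (higher is better).
toy_scores = np.array([
    [1.00, 0.00],  # extreme on objective 1
    [0.00, 1.00],  # extreme on objective 2
    [0.45, 0.45],  # balanced trade-off, sits in a concave part of the front
    [0.20, 0.20],  # dominated by the balanced candidate
])

print(pareto_mask(toy_scores))
print({w: best_by_weighted_sum(toy_scores, [w, 1 - w]) for w in (0.1, 0.5, 0.9)})
```

In this toy set the balanced candidate is Pareto-optimal, yet no choice of weights selects it, because a weighted sum can only reach points on the convex hull of the front.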
(3/5) We highlight the extensions from single-objective Bayesian optimization to multi-objective Bayesian optimization when choosing molecules from a discrete library. The primary difference is the definition of the acquisition function, with a few options listed in the fig above
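As one illustration of what a multi-objective acquisition function can look like, the sketch below ranks library members by two-objective hypervolume improvement over the current front. It is a simplification under stated assumptions: it uses point predictions only (no expectation over a surrogate's uncertainty), both objectives are maximized, and the function names, reference point, and toy numbers are hypothetical rather than taken from the review.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (two objectives, maximized)."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    # Points that do not exceed the reference on both objectives contribute nothing.
    pts = pts[(pts[:, 0] > ref[0]) & (pts[:, 1] > ref[1])]
    # Sweep from the largest first objective down, adding each new horizontal strip.
    pts = pts[np.argsort(-pts[:, 0])]
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area

def hv_improvement(candidate, front, ref):
    """Hypervolume gained by adding one candidate's predicted objectives to the front."""
    return hypervolume_2d(np.vstack([front, candidate]), ref) - hypervolume_2d(front, ref)

# Hypothetical current front, reference point, and predicted objectives for a small library.
front = [[2.0, 1.0], [1.0, 2.0]]
ref = (0.0, 0.0)
library_preds = [[0.5, 0.5], [1.5, 1.5], [2.0, 2.0]]

gains = [hv_improvement(c, front, ref) for c in library_preds]
next_pick = int(np.argmax(gains))  # library member with the largest predicted gain
print(gains, next_pick)
```

A dominated prediction scores zero here, so the ranking favors molecules predicted to extend the front rather than those that excel on a single objective.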
(4/5) We also describe the main categories of approaches in multi-objective generative design, _most of which_ follow the paradigm of "iterative distribution learning", & illustrate them through a few case studies
(5/5) Molecular design is fundamentally a multi-objective problem. We hope this will be a useful reference for folks looking to get into computer-aided molecular design and/or move away from scalarization
• • •
1/ Learning compound-protein interactions (CPI) w/ sequence & compound alone is tantalizing, but time and time again, CPI models fail to beat simple baselines. Does this paper do so successfully? An analysis from @samgoldman19 :
2/ The short answer is no, this model also fails to outperform a simple nearest neighbor baseline. The metabolic models presented in the paper are likely independently valuable, but are *not* enabled by deep learning
3/ Using the exact same splits, we made predictions with a KNN model whose distance is a weighted average of the sequence distance and the substrate distance. We perfectly match DLKCat performance on the test set (SI 5A,B). See our gist here
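For readers who want the shape of such a baseline, here is a minimal sketch. It is not the code from the gist: it assumes precomputed test-to-train distance matrices for sequences and substrates, and the function name, the mixing weight `alpha`, the value of `k`, and the uniform averaging of neighbor labels are all illustrative assumptions.

```python
import numpy as np

def knn_predict(seq_dist, sub_dist, train_y, k=3, alpha=0.5):
    """Predict each test label as the mean label of its k nearest training pairs.

    seq_dist, sub_dist: (n_test, n_train) distance matrices (assumed precomputed).
    alpha: weight on sequence distance; (1 - alpha) goes to substrate distance.
    """
    seq_dist = np.asarray(seq_dist, dtype=float)
    sub_dist = np.asarray(sub_dist, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    # Combine the two distances into a single weighted average.
    combined = alpha * seq_dist + (1 - alpha) * sub_dist
    # Indices of the k closest training examples for every test example.
    nearest = np.argsort(combined, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)

# Hypothetical toy data: one test pair, four training pairs.
toy_seq_dist = [[0.1, 0.9, 0.2, 0.8]]
toy_sub_dist = [[0.3, 0.7, 0.2, 0.9]]
toy_train_y = [1.0, 5.0, 3.0, 7.0]

toy_pred = knn_predict(toy_seq_dist, toy_sub_dist, toy_train_y, k=2, alpha=0.5)
print(toy_pred)
```

The baseline has no learned parameters beyond `k` and `alpha`, which is what makes it a useful sanity check for a deep CPI model evaluated on the same splits.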