In this paper, we present SynKB, the largest open-source, automatically extracted knowledge base of chemical synthesis protocols, which supports search over more than 6 million chemical synthesis procedures collected from patents.
By taking advantage of recent advances in natural language processing (NLP) for procedural texts, SynKB supports more flexible queries about reaction conditions and thus has the potential to help chemists search the literature for conditions used in relevant reactions as they design new synthetic routes.
Compared with proprietary databases such as Reaxys, SynKB achieves comparable or even higher recall for many queries while maintaining high precision.
In addition, our analysis shows that SynKB can complement proprietary chemistry databases in both content and search modalities (e.g., fewer than 20% of retrieved answers are shared between SynKB and Reaxys).