Noel O'Boyle @baoilleach
#11thICCS Jochen Sieg talking about bias control in structure-based virtual screening with machine learning
When you build a model in SBVS, is the predictor really generalizing? High correlation does not imply causation. We want to distinguish between patterns in the data we want to learn (causal patterns) and those we don't want to learn (non-causal).
A non-causal example would be one based entirely on the molecular weight.
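One way to check for this kind of non-causal signal is a single-descriptor baseline: rank the compounds by molecular weight alone and see how well that separates actives from inactives. The sketch below is a minimal illustration, not the speaker's method; the `roc_auc` helper and the toy molecular weights are hypothetical and only show the idea.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen active (label 1)
    scores higher than a randomly chosen inactive (label 0); ties count 0.5."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical toy data (molecular weights in Da), invented for illustration.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Biased set: every active is heavier than every inactive.
biased_mw = [450.0, 480.0, 510.0, 530.0, 300.0, 320.0, 350.0, 370.0]

# MW-matched set: actives and inactives interleave in molecular weight.
matched_mw = [400.0, 430.0, 450.0, 480.0, 410.0, 420.0, 460.0, 470.0]

biased_auc = roc_auc(biased_mw, labels)
matched_auc = roc_auc(matched_mw, labels)

print(f"MW-only AUC, biased set:  {biased_auc:.2f}")
print(f"MW-only AUC, matched set: {matched_auc:.2f}")
```

An MW-only AUC far from 0.5 suggests that a model could score well on the dataset without learning anything about binding.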
Datasets: DUD, DUD-E, MUV. What they have in common is the attempt to remove bias with respect to certain features, e.g. MW, LogP.

The original DUD missed out on unbiasing net charges. We've used our SMARTS Miner on DUD, and it's possible to find discriminative patterns.
...so bias depends on the combination of dataset, method, descriptor and other methods.

(ed: I missed something, what are the 5 unbiased features?)
DeepVS is a literature example of a docking-based convolutional neural network. Validated with DUD. Reported results are almost as good with and without protein information. We suspected a non-causal bias.
We believe that DeepVS learns the 2D dissimilarity in the DUD dataset.
The conclusion is that a particular dataset's unbiasing technique may not work with different descriptors and methods. (ed: I think he's saying that this is not a problem with DUD, but with how people use DUD for purposes other than those it was unbiased for)
(ed: personally I have always regarded these unbiased datasets as maximally biased, since they choose actives and inactives in a different way - I think that's the underlying cause of the results shown here)
How to build better datasets? Avoid non-causal data patterns, match simple properties, or use uniform sampling in simple descriptor space. Use baseline experiments to find and remove problems.
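As a rough illustration of "match simple properties", the sketch below greedily pairs each active with the closest unused decoy candidate in molecular weight. It assumes a single property and an arbitrary tolerance; the `match_decoys` function, the tolerance, and the numbers are hypothetical and are not taken from the talk or from any of the datasets mentioned.

```python
def match_decoys(active_mw, candidate_mw, tolerance):
    """For each active, pick the index of the closest unused candidate by
    molecular weight. Returns None for an active when no unused candidate
    lies within `tolerance`. Greedy, so results depend on the active order."""
    unused = set(range(len(candidate_mw)))
    chosen = []
    for mw in active_mw:
        best = None
        # sorted() keeps tie-breaking deterministic (lowest index wins).
        for idx in sorted(unused):
            if best is None or abs(candidate_mw[idx] - mw) < abs(candidate_mw[best] - mw):
                best = idx
        if best is not None and abs(candidate_mw[best] - mw) <= tolerance:
            chosen.append(best)
            unused.remove(best)
        else:
            chosen.append(None)
    return chosen


# Hypothetical molecular weights in Da, invented for illustration.
actives = [300.0, 450.0, 600.0]
candidates = [150.0, 310.0, 445.0, 452.0, 900.0]

matches = match_decoys(actives, candidates, tolerance=25.0)
for mw, idx in zip(actives, matches):
    decoy = "no match" if idx is None else f"candidate {idx} ({candidates[idx]} Da)"
    print(f"active {mw} Da -> {decoy}")
```

Matching on one property says nothing about bias in other descriptors, which is the point of the talk: a baseline experiment on each descriptor actually used is still needed afterwards.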