1/n Super proud of work published this week by @anzheng25 et al. in @NatMachIntell using #deeplearning to identify sequence context features predictive of transcription factor binding. rdcu.be/cdMmE Some key points:
2/n The main idea: TFs typically bind short motifs of 6-12 bp, but only a small fraction of the motif matches in the genome are actually bound. How well can “to bind or not to bind” be predicted from the 1kb of sequence context around the motif using #DeepSEA-style CNNs?
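For intuition, a DeepSEA-style model can be sketched very roughly as: one-hot encode the 1kb window, convolve with motif-detector filters, pool, and output a binding probability. This is a minimal NumPy forward pass with random, untrained weights (all shapes and weights here are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(seq):
    # Map A/C/G/T to a 4 x L one-hot matrix.
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((4, len(seq)))
    for i, b in enumerate(seq):
        x[idx[b], i] = 1.0
    return x

def conv1d(x, filters):
    # Valid cross-correlation of a (4, L) input with (n_f, 4, w) filters.
    n_f, _, w = filters.shape
    L = x.shape[1]
    out = np.empty((n_f, L - w + 1))
    for i in range(L - w + 1):
        out[:, i] = np.tensordot(filters, x[:, i:i + w], axes=([1, 2], [0, 1]))
    return out

def forward(x, filters, w_out, b_out):
    # Conv -> ReLU -> global max pool -> logistic output.
    h = np.maximum(conv1d(x, filters), 0.0)
    pooled = h.max(axis=1)
    return 1.0 / (1.0 + np.exp(-(pooled @ w_out + b_out)))

# Random weights for illustration only; a real model is trained on ChIP-seq labels.
filters = rng.normal(size=(8, 4, 8))   # 8 motif-detector filters of width 8
w_out = rng.normal(size=8)

seq = "".join(rng.choice(list("ACGT"), size=1000))  # a hypothetical 1kb context window
p_bound = forward(one_hot(seq), filters, w_out, 0.0)
```

A trained model of this shape maps each motif's flanking context to a bound/unbound probability.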
3/n Pretty well! For most TFs we tried, we could predict whether a given motif match was bound (per ChIP-seq) with high accuracy (mean auROC ~0.94) from local sequence context alone
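For reference, auROC is just the probability that a randomly chosen bound example outranks a randomly chosen unbound one, which can be computed from the rank statistic. A minimal sketch (hypothetical helper, ignoring tied scores):

```python
import numpy as np

def auroc(labels, scores):
    # auROC = P(score of random positive > score of random negative),
    # computed via the Mann-Whitney U statistic on score ranks.
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

So auROC ~0.94 means a bound motif outscores an unbound one about 94% of the time.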
4/n We also did a bunch of simulations to compare model interpretation methods to find out *why* we predict some sequences to be bound and others not. We looked at Grad-CAM, #DeepLIFT, saliency maps, and in silico saturation mutagenesis. Which is best?
5/n Answer: it depends. In silico mutagenesis typically does best but is really slow. DeepLIFT was better at finding important regions, while Grad-CAM seemed to offer better base-pair resolution for scoring individual bases.
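In silico saturation mutagenesis is conceptually simple: mutate each position to each alternative base and record how the prediction changes. A sketch (the toy "model" below just counts CG dinucleotides, standing in for a trained CNN; names are hypothetical):

```python
import numpy as np

BASES = "ACGT"

def saturation_mutagenesis(seq, predict):
    # Score every single-base substitution by the change it causes
    # in the model's predicted binding probability.
    ref = predict(seq)
    effects = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        for j, alt in enumerate(BASES):
            if alt == b:
                continue  # reference base: effect stays 0
            effects[i, j] = predict(seq[:i] + alt + seq[i + 1:]) - ref
    return effects

# Toy stand-in for a trained model: fraction of CG dinucleotides.
def toy_predict(seq):
    return seq.count("CG") / len(seq)

effects = saturation_mutagenesis("ACGTACGT", toy_predict)
```

The cost is clear from the loop: 3 × L forward passes per sequence, which is why it's so much slower than the gradient-based methods.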
6/n But how you choose training datasets for these models **really matters**. E.g. when “negative” (unbound) sequences are chosen at random, the model mostly learns “pioneer factor” motifs that are also predictive of open chromatin and probably only indirectly related to the target TF.
7/n On the other hand, conditioning all training data to lie within open chromatin regions results in learning very different features, probably more directly related to the TF itself, such as co-binding partners. As expected, this also substantially reduced our prediction accuracy.
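The two negative-set strategies above can be sketched as one sampling function (the candidate records and flags here are illustrative, not the paper's pipeline):

```python
import random

def sample_negatives(candidates, n, restrict_to_open=False, seed=0):
    # candidates: motif matches with 'bound' and 'open_chromatin' flags.
    # With restrict_to_open=True, negatives (like the positives) come only
    # from open chromatin, so accessibility alone can't separate the classes.
    pool = [c for c in candidates if not c["bound"]]
    if restrict_to_open:
        pool = [c for c in pool if c["open_chromatin"]]
    return random.Random(seed).sample(pool, min(n, len(pool)))

# Toy candidate motif matches (flags are made up for illustration).
candidates = [
    {"bound": True,  "open_chromatin": True},
    {"bound": False, "open_chromatin": True},
    {"bound": False, "open_chromatin": False},
    {"bound": False, "open_chromatin": False},
]
easy_negatives = sample_negatives(candidates, 2)                         # mostly closed chromatin
hard_negatives = sample_negatives(candidates, 2, restrict_to_open=True)  # open chromatin only
```

Random negatives are mostly in closed chromatin, so the model can win by learning accessibility features; restricting negatives to open chromatin forces it onto TF-specific signals.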
8/n Thinking about applying #deeplearning to interpret #GWAS and other variants: depending on how you train the model, you can end up with very different variant-level scores. I think there’s more work to do to figure out the best way to apply these for variant interpretation.
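A variant-level score from such a model is typically the change in predicted binding between the ref and alt alleles in context, so different training schemes yield different scores simply because `predict` differs. A hypothetical sketch (again using a toy stand-in model):

```python
def variant_score(context, offset, alt, predict):
    # Predicted-binding change from swapping the reference base at
    # `offset` for `alt` within the sequence context window.
    mutated = context[:offset] + alt + context[offset + 1:]
    return predict(mutated) - predict(context)

# Toy stand-in for a trained model: fraction of CG dinucleotides.
def toy_predict(seq):
    return seq.count("CG") / len(seq)

delta = variant_score("TACGTT", 2, "A", toy_predict)  # disrupts the CG
```

Swap in two models trained with different negative sets and the same variant can get very different `delta` values — the issue the tweet is pointing at.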
9/n On a related note, I am very proud of @anzheng25 for spearheading our lab’s first paper that is *not* about STRs 😀. (although, we’ve been thinking about how to apply these methods to them as well. Ideas welcome!)

Thread by Melissa Gymrek
