Peter Kraft @GENES_PK
@f2harrell @NPirastu @paulpharoah I worry three very distinct scientific goals for #GWAS and genetic association studies broadly are being conflated on this thread: (1) locus discovery, (2) causal variant discovery, and (3) disease prediction. Each requires different analysis strategies and interpretation. 1/n
@f2harrell @NPirastu @paulpharoah As to (1), as @tamar_sofer @paulpharoah and others have noted, the goal is just to find markers that are correlated with a causal variant (or variants). Nobody claims—or nobody should be claiming—that these markers are unique. 2/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer So it’s not surprising if Study A reports SNP 1 at Locus X and Study B reports SNP 2. Nor is it particularly disturbing, if SNPs 1 & 2 are correlated. If you dig thru supplements you will probably find that SNP 2 has strong evidence for association in Study A and vice versa. 3/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer Locus discovery is just a first step—we’re able to say that there’s most probably a causal variant (or variants) near the reported markers, but not too much more—but it’s an important first step. 4/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer As to (2), causal variant discovery aka “fine mapping”—this is hard, for reasons @f2harrell alludes to and which are well known to genetic epidemiologists and statistical geneticists. 5/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer This nice slide from @FHormozdiari illustrates why: the marginal p-values for SNPs strongly correlated with a causal variant are themselves strongly correlated. In any given study (modulo sample size) any one of these SNPs could have the smallest p-value, and most likely it will not be the causal SNP. 6/n
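A rough way to see this point is a small simulation. The sketch below is only an illustration under assumed settings, not anything taken from the slide: one causal SNP, nine proxy SNPs whose alleles copy the causal allele with probability 0.95, a modest additive effect, and normal noise. All function names and parameter values are hypothetical. For a fixed sample size, the SNP with the largest absolute correlation with the phenotype is also the one with the smallest marginal p-value, so the code ranks SNPs by |r|.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_locus(rng, n=300, n_proxies=9, maf=0.3, copy_prob=0.95, beta=0.3):
    """Simulate one locus; column 0 is the causal SNP, the rest are proxies.

    Assumption for illustration: each proxy allele copies the causal allele
    with probability copy_prob, otherwise it is drawn independently.
    """
    geno = [[0] * n for _ in range(n_proxies + 1)]
    for i in range(n):
        for _ in range(2):  # two haplotypes per person
            causal = 1 if rng.random() < maf else 0
            geno[0][i] += causal
            for j in range(1, n_proxies + 1):
                if rng.random() < copy_prob:
                    geno[j][i] += causal
                else:
                    geno[j][i] += 1 if rng.random() < maf else 0
    # Only the causal SNP affects the phenotype.
    pheno = [beta * geno[0][i] + rng.gauss(0.0, 1.0) for i in range(n)]
    return geno, pheno


def fraction_causal_on_top(n_reps=100, seed=1, **kwargs):
    """Fraction of simulated studies in which the causal SNP has the largest |r|."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        geno, pheno = simulate_locus(rng, **kwargs)
        strength = [abs(pearson(g, pheno)) for g in geno]
        if strength.index(max(strength)) == 0:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    print("causal SNP is the top hit in this fraction of studies:",
          fraction_causal_on_top())
```

Raising copy_prob or adding more proxies should make the causal SNP the top hit less often; increasing n or beta should help, though only slowly when the proxies are very tightly correlated.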
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari Standard feature selection algorithms like stepwise regression or lasso will not solve this problem, as they will still tend to pick one of the markers that happens to have stronger signal in the training data, and then keep the causal marker out of the model. 7/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari This is well known and has been explored in a number of very nice papers (e.g. PMIDs 25357204 25104515). 8/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari Recognizing the inappropriateness of marginal p-values and feature selection algorithms for “fine mapping,” statistical geneticists have turned to Bayesian approaches. (A far from exhaustive list of PMIDs: 25357204 25104515 26773131 27027514 23104008.) 9/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari IMO the nice thing about this approach is it returns what we want, posterior probabilities that a variant is causal. And you can naturally incorporate functional annotations into priors. 10/n
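To make "posterior probabilities that a variant is causal" concrete, here is a minimal sketch of one simple Bayesian fine-mapping calculation. It is a simplification and not the methods in the papers cited in the thread: it assumes exactly one causal variant at the locus, uses Wakefield-style approximate Bayes factors computed from each SNP's z-score and standard error, and lets functional annotation enter only through relative prior weights. The function names, the prior effect-size SD of 0.2, and the example numbers are all hypothetical.

```python
import math


def log_abf(z, se, prior_sd=0.2):
    """Log approximate Bayes factor for association (Wakefield-style).

    z is the marginal z-score, se the standard error of the effect estimate,
    prior_sd the assumed prior SD of the true effect (an arbitrary choice here).
    """
    v = se ** 2
    w = prior_sd ** 2
    shrink = w / (v + w)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink


def posterior_causal_probs(z_scores, std_errs, prior_weights=None, prior_sd=0.2):
    """Posterior probability that each SNP is the causal one.

    Assumes exactly one causal SNP in the region. prior_weights are positive
    relative weights (e.g. larger for SNPs in annotated functional elements);
    a flat prior is used when they are omitted.
    """
    m = len(z_scores)
    if prior_weights is None:
        prior_weights = [1.0] * m
    log_terms = [
        math.log(p) + log_abf(z, se, prior_sd)
        for z, se, p in zip(z_scores, std_errs, prior_weights)
    ]
    top = max(log_terms)  # subtract the max for numerical stability
    weights = [math.exp(t - top) for t in log_terms]
    total = sum(weights)
    return [wgt / total for wgt in weights]


if __name__ == "__main__":
    z = [5.2, 5.0, 4.9, 1.0]  # made-up z-scores for four correlated SNPs
    se = [0.05] * 4
    print("flat prior:     ", posterior_causal_probs(z, se))
    # Hypothetical annotation: SNP 3 sits in a regulatory element, so it
    # gets five times the prior weight of the others.
    print("annotated prior:", posterior_causal_probs(z, se, [1, 1, 5, 1]))
```

Under the flat prior, SNPs with similar z-scores end up with similar posterior probabilities, which is the honest answer when the data cannot tell them apart; the annotation-informed prior shifts probability toward the SNP judged more plausible a priori.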
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari Of course fancy methods are not a panacea. Because of strong correlation at many loci we can’t resolve which are the causal variants—even in very large samples there are still dozens or hundreds that on the basis of statistical associations alone could be causal. 11/n
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari As for (3) prediction modeling, I’m running out of steam so I’ll just say that gen epi has long recognized that feature selection for prediction is different than selection for discovery—and there are plenty of shrinkage methods that don’t require feature selection at all. 12/n
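As one illustration of shrinkage without feature selection, the sketch below fits ridge regression by coordinate descent. Ridge is used here only as a familiar example of a shrinkage method; the thread does not name a specific one. The function name is hypothetical, and the inputs are assumed to be centered so no intercept is needed.

```python
def ridge_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimise 0.5 * ||y - X b||^2 + 0.5 * lam * ||b||^2 by coordinate descent.

    columns is a list of predictor columns (one list per marker), assumed
    centered; y is the centered outcome. Every marker keeps a coefficient,
    so nothing is selected in or out of the model.
    """
    beta = [0.0] * len(columns)
    resid = list(y)  # current residual y - X b
    col_ss = [sum(v * v for v in col) for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Inner product of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(c * r for c, r in zip(col, resid)) + col_ss[j] * beta[j]
            new = rho / (col_ss[j] + lam)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


if __name__ == "__main__":
    # Two perfectly correlated markers and an outcome that depends on them.
    marker = [-1.0, 0.0, 1.0]
    outcome = [-2.0, 0.0, 2.0]
    print(ridge_coordinate_descent([marker, marker], outcome, lam=1.0))
```

In this toy example the two identical markers receive equal coefficients (each close to 0.8) instead of one being kept and the other dropped, which is fine for prediction even though it says nothing about which marker is causal.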
@f2harrell @NPirastu @paulpharoah @tamar_sofer @FHormozdiari To be clear, @f2harrell has raised important issues, which anybody new to genetic epidemiology will have to wrestle with, some of which the field is still wrestling with. But the field is not unaware of these issues, and there is already quite a body of work on them. 13/fin