Here's how I see the SNP->gene gold standard issue.

This map separates the problem of identifying the causal transcript for a disease from the issue of identifying which transcripts are altered by a SNP.

As we know from, e.g., lactase, many mRNAs are altered but only one is causal.
The map acknowledges that a GWAS association is probably acting through a functional variant that impacts a transcript, which (usually) impacts a protein, which may alter a biomarker or intermediate phenotype, which in turn manifests as a change in disease risk or complex phenotype.
From left to right:
cis-eQTLs and splicing-QTLs reveal mechanisms by which a DNA variant can impact mRNA abundance. It's good to model and predict these.
At a particular locus these may or may not translate into elucidation of the causal transcript for the disease phenotype.
I feel cis-pQTLs are an underappreciated source of gold-standard causal transcripts. It just requires the simple assumption that a DNA variant affecting protein levels within (say) 1 Mb of the gene for that protein is acting through that gene. We applied this in our ProGem paper.
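A minimal sketch of that cis-window assumption, for illustration only. The `Gene` class, the function names, and the toy coordinates are all hypothetical (they are not taken from the ProGem paper), and the 1 Mb window is just the "(say) 1 Mb" mentioned above:

```python
from dataclasses import dataclass
from typing import Optional

CIS_WINDOW_BP = 1_000_000  # the "(say) 1 Mb" window; an adjustable assumption


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record: symbol plus genomic interval."""
    symbol: str
    chrom: str
    start: int
    end: int


def distance_to_gene(chrom: str, pos: int, gene: Gene) -> Optional[int]:
    """Distance in bp from a variant to the gene; 0 if inside, None if other chromosome."""
    if chrom != gene.chrom:
        return None
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def is_cis_pqtl(chrom: str, pos: int, gene: Gene, window: int = CIS_WINDOW_BP) -> bool:
    """True if a variant associated with a protein's level lies within `window`
    bp of the gene encoding that protein (the cis-pQTL assumption)."""
    d = distance_to_gene(chrom, pos, gene)
    return d is not None and d <= window


# Toy, made-up coordinates (not a real gene or variant).
toy_gene = Gene("GENE_A", "1", 5_000_000, 5_050_000)
near_hit = is_cis_pqtl("1", 5_500_000, toy_gene)   # 450 kb downstream
far_hit = is_cis_pqtl("1", 7_000_000, toy_gene)    # ~1.95 Mb away
other_chrom = is_cis_pqtl("2", 5_010_000, toy_gene)
```

Under this rule, a pQTL that passes the check would be assigned to the gene encoding the measured protein; anything outside the window or on another chromosome would be treated as trans and left out of the gold-standard set.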
One well-established source of gold standard genes is when a Mendelian disease gene occurs close to a SNP for a related disease.
One example is the glucokinase gene (GCK), which sits near a diabetes SNP and is also a MODY gene.
type2diabetesgenetics.org/gene/geneInfo/…
omim.org/entry/138079
The "Exome-validation" approach assumes a gene harboring a rare coding mutation leading to an extreme phenotype is causal for common variants leading to modest variation in that same phenotype. E.g.: rs995000 for LDL-C is probably acting through ANGPTL3
nature.com/articles/nrg.2…
Another common approach assumes a gene encoding a known drug target for treating a disease or symptom is likely to be causal for that same disease or symptom. A classic example is rs12916 with an association for hypercholesterolemia acting through HMGCR
genetics.opentargets.org/variant/5_7536…
My favorite source for gold standard SNP->genes is metabolite GWAS. I've highlighted many examples here on Twitter. Here's just one of my favorites. These metabolites are only produced by this gene. This gene only produces these metabolites.
I have collected 430 examples to date.
My last bucket: cases where experimental validation has thoroughly mapped the full information flow from variant to causal transcript to impact on disease endpoint.
E.g.:
i) rs1558902 for BMI acting through IRX3/IRX5, Claussnitzer, 2015
ii) rs7528419 for LDL-C acting through SORT1, Musunuru, 2010
So what do you think?

Are there categories of "gold standard SNP->gene" examples I've missed?

Do you disagree with any of the buckets I've presented here?
Keep Current with Eric Fauman
