Thanks for the shout out, and welcome any new followers.
I like looking at GWAS and trying to decipher the causal biology behind the hits.
I use this account to highlight interesting results and provide links to the tools and approaches I find most useful.
One theme I come back to is that, because the closest gene is usually the correct causal gene, any analysis of a new GWAS should start there.
Here's a story from almost a year ago: a great study on heart trabeculae that initially ignored the closest genes.
Fortunately this paper was on bioRxiv. After I tweeted about the omissions, and before it was published in Nature, the paper was amended to include 2 of the prominent heart structure genes: pubmed.ncbi.nlm.nih.gov/32814899/
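For anyone who wants to try the closest-gene starting point on their own lead SNPs, here is a minimal Python sketch. It is only an illustration: the gene names and coordinates are made up, `Gene`, `distance_to_gene` and `closest_gene` are hypothetical helpers, and distance is measured to the gene body (distance to the TSS is another common choice).

```python
from typing import NamedTuple, Optional, Sequence


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # gene body start (bp)
    end: int    # gene body end (bp)


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance in bp from a SNP to a gene body; 0 if the SNP lies inside it."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def closest_gene(chrom: str, pos: int, genes: Sequence[Gene]) -> Optional[Gene]:
    """Return the gene on the same chromosome nearest to the SNP, or None."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda g: distance_to_gene(pos, g))


# Toy annotation with invented coordinates (not real genes).
GENES = [
    Gene("GENE_A", "1", 1000, 5000),
    Gene("GENE_B", "1", 9000, 12000),
    Gene("GENE_C", "2", 2000, 3000),
]

# Hypothetical lead SNPs as (chromosome, position) pairs.
for chrom, pos in [("1", 6000), ("1", 8500), ("2", 2500)]:
    hit = closest_gene(chrom, pos, GENES)
    print(chrom, pos, hit.name if hit else None)
```

In practice the gene table would come from a real annotation, and the closest gene is the place to start looking, not the final answer.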
Today @robo_gwas tweeted out a GWAS on mitochondrial abundance appearing in Human Genetics. This also went up first on bioRxiv, but unfortunately I only discovered it today, so my suggestions will have to await some future publication.
I love the idea of this GWAS. The authors estimated the abundance of mtDNA in the blood of @uk_biobank participants by using the intensities of probes mapping to the mitochondrial genome.
@HaggSara, Juulia Jylhävä, Yunzhang Wang, Kamila Czene & Felix Grassmann
I count 1,199 lead SNPs in this Manhattan plot! There's nothing particularly special about leg fat-free mass; this trait is highly correlated with other body size traits: ukbb-rg.hail.is/rg_summary_231…
About half the genes in the diagram (the ones with a 7) are also involved in closely related monogenic diseases. This is generally a reliable way to identify a true causal gene.
I looked across all the loci at all genes involved in "rare cardiac diseases" orpha.net/consor/cgi-bin…
First up are genes involved in depolarization and repolarization of the heart. These are all previously known loci, but fall into that nice category of closest gene and also rare disease gene that makes them highly likely to be causal (ok: SCN5A/SCN10A is a special case)
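The "closest gene and also rare disease gene" filter can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the SNP ids, gene names and the rare disease gene set are placeholders (a real set might be exported from a resource like Orphanet), and `flag_supported_loci` is a hypothetical helper, not a tool I'm describing.

```python
from typing import Dict, List, Set, Tuple


def flag_supported_loci(
    closest_gene_by_snp: Dict[str, str],
    rare_disease_genes: Set[str],
) -> List[Tuple[str, str, bool]]:
    """Label each locus by whether its closest gene is also a rare disease gene.

    Returns (snp, gene, supported) rows with supported loci listed first.
    """
    rows = [
        (snp, gene, gene in rare_disease_genes)
        for snp, gene in closest_gene_by_snp.items()
    ]
    # Stable sort: supported loci move to the front, original order otherwise kept.
    rows.sort(key=lambda row: not row[2])
    return rows


# Placeholder inputs (invented ids and names, for illustration only).
closest = {"snp1": "GENE_A", "snp2": "GENE_B", "snp3": "GENE_C"}
rare_cardiac = {"GENE_B", "GENE_C", "GENE_X"}

for snp, gene, supported in flag_supported_loci(closest, rare_cardiac):
    print(snp, gene, "closest + rare disease gene" if supported else "closest only")
```

Loci that pass both checks are the ones I'd treat as most likely causal; the rest need more evidence.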
Here's how I see the SNP->gene gold standard issue.
This map separates the problem of identifying the causal transcript for a disease from the issue of identifying which transcripts are altered by a SNP.
As we know from, e.g., lactase, many mRNAs are altered but only 1 is causal.
The map acknowledges that a GWAS association is probably acting through a functional variant that impacts a transcript, which (usually) impacts a protein, which may alter a biomarker or intermediate phenotype, which in turn manifests as a change in disease risk or a complex phenotype.
From left to right:
cis-eQTLs and splicing-QTLs reveal mechanisms by which a DNA variant can impact mRNA abundance. It's good to model and predict these.
At a particular locus these may or may not translate into elucidation of the causal transcript for the disease phenotype.