I count 1,199 lead SNPs in this Manhattan plot! There is nothing particularly special about leg fat-free mass; this trait is highly correlated with other body size traits: ukbb-rg.hail.is/rg_summary_231…
When faced with so many loci, one useful analysis is to look for pathway enrichment among the genes closest to each lead SNP.
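A minimal sketch of what "closest gene" means here, assuming each gene is described by a chromosome plus start and end coordinates. The `Gene` tuple, the `closest_genes` helper, and the toy coordinates are all hypothetical, made up for illustration; they are not the pipeline used for this thread.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # gene start coordinate (bp)
    end: int    # gene end coordinate (bp)


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance in bp from a SNP position to a gene body (0 if inside it)."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def closest_genes(chrom: str, pos: int, genes: List[Gene], k: int = 2) -> List[str]:
    """Return the names of the k genes nearest to a lead SNP, nearest first."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    ranked = sorted(same_chrom, key=lambda g: distance_to_gene(pos, g))
    return [g.name for g in ranked[:k]]


# Toy annotation with placeholder names and made-up coordinates.
toy_genes = [
    Gene("GENE_A", "1", 100, 200),
    Gene("GENE_B", "1", 500, 600),
    Gene("GENE_C", "1", 1000, 1100),
    Gene("GENE_D", "2", 100, 200),
]

nearest_two = closest_genes("1", 450, toy_genes)
```

With `k=2` this yields both the closest and the 2nd closest gene for each lead SNP, which is the distinction drawn in the pathway count.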
Here I asked which "pathway" shows the highest number of genes in this GWAS.
The answer is Biocarta "cell cycle": of 23 cell cycle genes, 9 are the closest gene to a lead SNP and 1 more is the 2nd closest:
These are the 10 cell cycle genes that are also leg fat free mass candidate genes - cyclins, cyclin dependent kinases and the CDK phosphatase CDC25A: en.wikipedia.org/wiki/CDC25A
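One hedged way to ask whether 10 of 23 pathway genes is more than chance would give is a hypergeometric tail test. This is a sketch, not the method used for the thread: the thread only reports the counts 23 and 10, so the background gene count and the size of the candidate gene set below are assumptions picked for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_pathway: int,
                           n_candidates: int, n_overlap: int) -> float:
    """P(overlap >= n_overlap) when n_candidates genes are drawn at random
    from n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_candidates)
    upper = min(n_pathway, n_candidates)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_candidates - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# From the thread: 23 genes in the pathway, 10 of them among the candidates.
# ASSUMED for illustration only: ~20,000 background genes and ~2,000
# candidate genes (closest plus 2nd closest across all loci).
p_value = hypergeom_enrichment_p(
    n_background=20_000, n_pathway=23, n_candidates=2_000, n_overlap=10
)
```

The test treats genes as independent draws, which ignores gene length and clustering, so the p-value is better read as a rough guide than as a calibrated statistic. The candidate set also has to be defined the same way as the overlap: if the overlap counts 2nd-closest genes, so should `n_candidates`.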
I have some familiarity with CDC25A as I solved the crystal structure in 1998. It was the first phosphatase with this specific fold, an example of convergent evolution to the cysteine-X-X-X-X-X-arginine motif. pubmed.ncbi.nlm.nih.gov/9604936/
I haven't found a lot of literature linking the genetics of human body size to regulation of the cell cycle. This may be the closest, relating Cdkn1b (one of the genes listed above) to body size in mice.
One interesting observation: most of these cell cycle leg mass genes have a significant hit for height, and height is often the most significant association.
While height is clearly correlated to leg mass, these genes may skew more toward height than the GWAS overall.
So this is interesting to me. Regulation of the cell cycle influences cellular proliferation, and through that height or body size.
Clearly to be bigger you need more cells, but I haven't seen this explicit link in the GWAS before.
What do you think?
About half the genes in the diagram (the ones with a 7) are also involved in closely related monogenic diseases. This is generally a reliable way to identify a true causal gene.
I looked across all the loci at all genes involved in "rare cardiac diseases" orpha.net/consor/cgi-bin…
First up are genes involved in depolarization and repolarization of the heart. These are all previously known loci, but they fall into that nice category of being both the closest gene and a rare disease gene, which makes them highly likely to be causal (ok: SCN5A/SCN10A is a special case).
Here's how I see the SNP->gene gold standard issue.
This map separates the problem of identifying the causal transcript for a disease from the issue of identifying which transcripts are altered by a SNP.
As we know from, e.g., lactase, many mRNAs are altered but only 1 is causal.
The map acknowledges that a GWAS association is probably acting through a functional variant that impacts a transcript that (usually) impacts a protein that may alter a biomarker or intermediate phenotype which manifests as a change in disease risk or complex phenotype.
From left to right:
cis-eQTLs and splicing-QTLs reveal mechanisms by which a DNA variant can impact mRNA abundance. It's good to model and predict these.
At a particular locus these may or may not translate into elucidation of the causal transcript for the disease phenotype.