Welcome all! We've added several hundred followers over the past few weeks, so as a quick intro: I use this account mainly to explore interesting issues related to the biological interpretation of GWAS. @SbotGwa from @andganna provides me with a steady diet of interesting material.
@SbotGwa alternates between GWAS from @uk_biobank and @FinnGen_FI.
Yesterday's Manhattan plot from FinnGen yielded a single hit for the trait "other and unspecified corneal deformities and disorders"
Let's dive in.
I often say it's good to take the most significant association at a locus as we start to interpret it. @FinnGen_FI uses a modified PheWeb server to show results.
Here is the PheWAS for this SNP.
Top association is Keratitis, inflammation of the cornea.
But the strongest p-value is actually a poor surrogate for "biologically most important".
I made this complex plot to show the relationship between effect size, p-value and case numbers for the disease associations at this locus.
Dot size and labels indicate case numbers, which vary a lot.
This relates to "lumping and splitting" of GWAS traits. The narrower phenotype has a much larger effect size, but a weaker p-value because the smaller case count reduces statistical power.
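To make the power point concrete, here is a minimal sketch using a standard large-sample approximation for the standard error of a per-allele log odds ratio. All the numbers (allele frequency, effect sizes, case and control counts) are hypothetical and chosen for illustration; they are not taken from the FinnGen results.

```python
import math


def approx_log_or_se(maf, n_cases, n_controls):
    """Approximate standard error of a per-allele log odds ratio.

    Uses the large-sample approximation
    SE ~ 1 / sqrt(2 * maf * (1 - maf) * n_cases * n_controls / (n_cases + n_controls)).
    """
    genotype_variance = 2.0 * maf * (1.0 - maf)
    # With many more controls than cases, this term is close to n_cases,
    # so the case count is what limits precision.
    balanced_n = n_cases * n_controls / (n_cases + n_controls)
    return 1.0 / math.sqrt(genotype_variance * balanced_n)


def approx_two_sided_p(beta, maf, n_cases, n_controls):
    """Two-sided p-value for beta under a normal approximation."""
    z = beta / approx_log_or_se(maf, n_cases, n_controls)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical inputs: same SNP, same controls, two phenotype definitions.
MAF = 0.30
N_CONTROLS = 200_000

# "Lumped" broad phenotype: many cases, modest effect.
p_broad = approx_two_sided_p(beta=0.20, maf=MAF, n_cases=5_000, n_controls=N_CONTROLS)

# "Split" narrow phenotype: few cases, larger effect.
p_narrow = approx_two_sided_p(beta=0.50, maf=MAF, n_cases=300, n_controls=N_CONTROLS)

print(f"broad phenotype  p = {p_broad:.2e}")
print(f"narrow phenotype p = {p_narrow:.2e}")
```

With these made-up inputs the broad phenotype gets the smaller p-value even though its effect size is less than half that of the narrow one, which is why ranking traits by p-value alone can mislead.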
So what to make of "other and unspecified"?
Doesn't sound very specific...
According to the @GWASCatalog this locus has been associated with numerous eye traits including corneal thickness, intraocular pressure and macular thickness.
The lead SNP in this GWAS sits within an intron of COL4A3.
The gene closest to the lead SNP is usually the true causal gene, especially if the lead SNP sits within the footprint of the gene.
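As a rough sketch of the "closest gene" heuristic: treat each gene as an interval on the chromosome, score a SNP inside the footprint as distance zero, and otherwise take the distance to the nearer gene boundary. The gene names and coordinates below are placeholders invented for illustration, not real annotations, and a real analysis would pull gene boundaries from an annotation source.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # first base of the gene footprint
    end: int    # last base of the gene footprint


def distance_to_gene(snp_pos: int, gene: Gene) -> int:
    """Distance in bases from a SNP to a gene; 0 if the SNP lies within it."""
    if gene.start <= snp_pos <= gene.end:
        return 0
    return min(abs(snp_pos - gene.start), abs(snp_pos - gene.end))


def closest_gene(snp_pos: int, genes: List[Gene]) -> Gene:
    """Return the gene on the same chromosome that is nearest to the SNP."""
    return min(genes, key=lambda gene: distance_to_gene(snp_pos, gene))


# Hypothetical genes on one chromosome (placeholder names and coordinates).
genes = [
    Gene("GENE_A", 1_000, 5_000),
    Gene("GENE_B", 8_000, 20_000),
    Gene("GENE_C", 26_000, 30_000),
]

intronic_snp = 12_000    # falls inside GENE_B's footprint
intergenic_snp = 6_000   # between GENE_A and GENE_B, nearer to GENE_A

print(closest_gene(intronic_snp, genes).name)
print(closest_gene(intergenic_snp, genes).name)
```

This is only a starting point for interpretation: it ranks candidates by position and says nothing about which transcript the variant actually affects.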
So I'll wrap up here.
Not only can we conclude that COL4A3 is likely the causal gene at the locus, but also that the 330 "unspecified corneal deformities" likely represent a specific form of keratoconus.
(the image is collagen)
Surprisingly, the @finngen GWAS of keratoconus is flat and shows no association at COL4A3 or anywhere else. Not really sure what that means. r4.finngen.fi/pheno/H7_CORNE…
• • •
Today's GWAS of urolithiasis, kidney stones and other stones of the urinary tract, provides a wonderful window into calcium, phosphate and vitamin D metabolism.
One nice thing about putting my GWAS interpretations here in Twitter is I can always quickly find what I may have written about a gene or a trait before.
Here's my write-up on urolithiasis from 2 years ago in a completely different cohort, Biobank Japan (BBJ).
On the left are the top hits from @finngen; on the right, the top hits from Biobank Japan.
5 of 6 loci from FinnGen were also found in BBJ.
Note some of the lead SNPs may differ, but the causal genes line up.
Thanks for the shout out, and welcome any new followers.
I like looking at GWAS and trying to decipher the causal biology behind the hits.
I use this account to highlight interesting results and provide links to the tools and approaches I find most useful.
One theme I come back to is that, because the closest gene is usually the correct causal gene, any analysis of a new GWAS should start there.
Here's a story from almost a year ago, a great study on heart trabeculae that initially ignored the closest genes
Fortunately, this paper was on bioRxiv. After I tweeted about the omissions and before it was published in Nature, the paper was amended to include 2 of the prominent heart structure genes: pubmed.ncbi.nlm.nih.gov/32814899/
I love the idea of this GWAS. The authors estimated the abundance of mtDNA in the blood of @uk_biobank participants by using the intensities of probes mapping to the mitochondrial genome.
@HaggSara
Juulia Jylhävä
Yunzhang Wang
Kamila Czene &
Felix Grassmann
I count 1,199 lead SNPs in this Manhattan plot! There's nothing particularly special about leg fat-free mass; this trait is highly correlated with other body size traits: ukbb-rg.hail.is/rg_summary_231…
About half the genes in the diagram (the ones with a 7) are also involved in closely related monogenic diseases. This is generally a reliable way to identify a true causal gene.
I looked across all the loci at all genes involved in "rare cardiac diseases" orpha.net/consor/cgi-bin…
First up are genes involved in depolarization and repolarization of the heart. These are all previously known loci, but fall into that nice category of closest gene and also rare disease gene that makes them highly likely to be causal (ok: SCN5A/SCN10A is a special case)
Here's how I see the SNP->gene gold standard issue.
This map separates the problem of identifying the causal transcript for a disease from the issue of identifying which transcripts are altered by a SNP.
As we know from, e.g., lactase, many mRNAs are altered but only 1 is causal.
The map acknowledges that a GWAS association is probably acting through a functional variant that impacts a transcript that (usually) impacts a protein that may alter a biomarker or intermediate phenotype which manifests as a change in disease risk or complex phenotype.
From left to right:
cis-eQTLs and splicing-QTLs reveal mechanisms by which a DNA variant can impact mRNA abundance. It's good to model and predict these.
At a particular locus these may or may not translate into elucidation of the causal transcript for the disease phenotype.