Eric Fauman
GWAS whisperer
'Nothing in biology makes sense except in the light of evolution' -Dobzhansky
DMs open
Executive Director, Integrative Biology @Pfizer
Sep 3, 2022 7 tweets 5 min read
Happy to have been a part of this METSIM and @FinnGen_FI effort combining metabolomics, transcriptomics and disease traits.
Among many other results we see again lessons for appropriate and inappropriate ways to interpret eQTLs.
pubmed.ncbi.nlm.nih.gov/36055244/
We confirm (again) that you cannot use eQTLs to identify, select or prioritize the true causal gene. As in the 2020 paper by Ndungu & @markmccarthyoxf, we find a precision of 8% using TWAS alone.
On average TWAS will flag 11 wrong genes for every 1 correct gene. pubmed.ncbi.nlm.nih.gov/31978332/
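The link between precision and the wrong-to-correct ratio is simple arithmetic. Here is a minimal sketch; the function name is my own and is not taken from either paper.

```python
def wrong_per_correct(precision: float) -> float:
    """Number of incorrect genes flagged for each correct gene at a given precision."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    # precision = correct / (correct + wrong), so wrong / correct = (1 - p) / p
    return (1 - precision) / precision


ratio = wrong_per_correct(0.08)
print(f"{ratio:.1f} wrong genes per correct gene")  # 11.5 wrong genes per correct gene
```

A precision of exactly 1 in 12 (about 8.3%) gives 11 wrong genes per correct gene, so the "8%" and "11 to 1" figures agree to within rounding.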
May 8, 2022 17 tweets 11 min read
Why you can't use eQTLs to interpret GWAS hits:

A story in 2 parts
I like to peruse @gwascatalog for novel causal genes, especially for metabolite GWAS.
This paper was just added to the @gwascatalog.

Among other things it describes a GWAS for circulating copper levels.

(copper counts as a metabolite, right?)

ncbi.nlm.nih.gov/pmc/articles/P…
Mar 28, 2022 15 tweets 9 min read
Another fantastic gene story from the METSIM metabolomics GWAS, now available in Nature Communications
rdcu.be/cJYVY
The trait is “carotene diol”. The @Metabolon platform identifies 3 unique metabolites, but the GWAS reveals some consistent signals across these 3 molecules.
pheweb.org/metsim-metab/p…
Mar 8, 2022 12 tweets 6 min read
Folks who follow me on Twitter will have seen bits of this before, but with the help of my @pfizer colleague Craig Hyde we have now provided some mathematical structure to my observations about distances from GWAS lead SNPs to causal genes: biorxiv.org/content/10.110…
We started with the recent pQTL study from @pietznerm et al: pubmed.ncbi.nlm.nih.gov/34648354/
It is well known that the distance from lead SNP to cognate gene follows an approximate exponential decay.
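To make "approximate exponential decay" concrete, here is a rough sketch that treats SNP-to-gene distance as exponentially distributed. The decay length `scale_bp` is a made-up illustrative value, not an estimate from the preprint or the pQTL study.

```python
import math


def fraction_within(distance_bp: float, scale_bp: float) -> float:
    """Share of lead SNPs expected within distance_bp of their cognate gene,
    under a pure exponential model (the exponential CDF)."""
    return 1.0 - math.exp(-distance_bp / scale_bp)


scale_bp = 50_000  # hypothetical decay length, for illustration only
for d in (10_000, 50_000, 500_000):
    print(d, round(fraction_within(d, scale_bp), 3))
```

Under this model most of the probability mass sits within a few decay lengths of the gene, which is the qualitative shape the observations describe.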
Jan 6, 2022 6 tweets 4 min read
While it is true that the gene closest to a GWAS peak is not always the causal gene, it is also true that it usually is.
In fact, we can quantify how often we should expect the causal gene to be the closest gene, and that number is about 70%.
3 papers from 2021 help pin this down:
Activity-by-contact (ABC-Max) predicts a causal gene for a GWAS SNP using a combination of cell-type specific chromatin accessibility, epigenome marks and chromatin conformation, which can also be estimated by SNP-TSS distance: pubmed.ncbi.nlm.nih.gov/33828297/
Oct 22, 2021 9 tweets 4 min read
Oct 16, 2021 5 tweets 3 min read
Having this enormous collection of pQTLs allows us to answer the question (again):

Which is more relevant:

Distance of a GWAS SNP to the TSS (transcription start site) or to the gene body of a candidate gene?
pubmed.ncbi.nlm.nih.gov/34648354/
Usually you get the same closest gene measuring to TSS or to gene body.

But in the top case a pQTL for ACAA1 sits inside an irrelevant gene but is closer to the TSS for ACAA1.

But a pQTL for DNAJC17 sits closer to a TSS for a random gene despite sitting within DNAJC17.
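The two ways of measuring "closest" can be sketched as below. The gene names and coordinates are invented for illustration (they are not the ACAA1 or DNAJC17 loci), and each gene is simplified to a single TSS at its strand-appropriate end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate of the gene body
    end: int     # rightmost coordinate of the gene body
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # Simplification: one TSS, at the start on "+" and at the end on "-".
        return self.start if self.strand == "+" else self.end


def dist_to_tss(pos: int, gene: Gene) -> int:
    return abs(pos - gene.tss)


def dist_to_body(pos: int, gene: Gene) -> int:
    # Zero when the SNP lies inside the gene body.
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def closest(pos, genes, dist):
    """Name of the gene minimizing the chosen distance measure."""
    return min(genes, key=lambda g: dist(pos, g)).name


# Made-up locus: the SNP sits inside GENE_A but is nearer to GENE_B's TSS.
genes = [Gene("GENE_A", 1_000, 101_000, "+"), Gene("GENE_B", 105_000, 150_000, "+")]
snp = 95_000
print(closest(snp, genes, dist_to_tss))   # GENE_B
print(closest(snp, genes, dist_to_body))  # GENE_A
```

The two measures only disagree in cases like this one, where a SNP lies inside a long gene but near the start of a neighbor.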
Oct 15, 2021 5 tweets 2 min read
When protein abundance is the trait, the simplest assumption is that the gene encoding the protein is the causal gene.
This catalog of 10,674 pQTLs from @pietznerm et al provides a rare unbiased look at GWAS SNP->causal gene genomic properties.
I took a quick look at the SNP-gene distances for all cases where the lead SNP had an rsID and the trait had a unique HGNC gene symbol. In 3,475 cases the SNP and cognate gene are on the same chromosome, 2,985 times within 500 kb, with a very strong distance dependence.
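A tally like this one can be sketched as follows. The record layout, the toy data and the choice of a single gene position are my assumptions, not the format of the published catalog; the last line assumes the 2,985 cases are a subset of the 3,475.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    snp_chrom: str
    snp_pos: int
    gene_chrom: str
    gene_pos: int  # one reference position per gene (an assumption, e.g. the TSS)


def summarize(pairs, window=500_000):
    """Count pairs on the same chromosome, and how many of those fall within the window."""
    same_chrom = [p for p in pairs if p.snp_chrom == p.gene_chrom]
    near = [p for p in same_chrom if abs(p.snp_pos - p.gene_pos) <= window]
    return len(same_chrom), len(near)


# Toy records, not real pQTLs.
toy = [
    Pair("1", 1_000_000, "1", 1_200_000),  # same chromosome, 200 kb away
    Pair("1", 1_000_000, "1", 9_000_000),  # same chromosome, 8 Mb away
    Pair("2", 500, "7", 500),              # different chromosomes
]
print(summarize(toy))  # (2, 1)

# With the counts quoted in the thread:
print(f"{2985 / 3475:.1%}")  # 85.9%
```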
Oct 15, 2021 7 tweets 3 min read
In mapping SNPs to genes we clearly can do better than taking the closest gene, but that should be the baseline by which we compare other methods.
@cr_farber et al, I hope you'll consider this before submitting this for publication.
In this preprint the authors started with 1,097 lead SNPs for bone mineral density from pubmed.ncbi.nlm.nih.gov/30598549/ and applied TWAS and eQTL colocalization to identify "potentially causal genes".
Mar 27, 2021 12 tweets 6 min read
A well-behaved GWAS yields strong signals for the kinds of genes that contribute to the phenotypic variation.
This provides strong priors for discerning likely causal genes hidden at other loci.

With this in mind, let's revisit the telomere GWAS

medrxiv.org/content/10.110…
Just going by closest gene, many telomere biology related themes emerge.
Jan 31, 2021 6 tweets 4 min read
Today's GWAS of urolithiasis, kidney stones and other stones of the urinary tract, provides a wonderful window into calcium, phosphate and vitamin D metabolism.
One nice thing about putting my GWAS interpretations here on Twitter is I can always quickly find what I may have written about a gene or a trait before.

Here's my write-up on urolithiasis from 2 years ago in a completely different cohort, BioBank Japan.

Jan 30, 2021 11 tweets 7 min read
Welcome all! We've added several hundred followers over the past few weeks, so as a quick intro, I use this account mainly to explore interesting issues related to the biological interpretation of GWAS
@SbotGwa from @andganna provides me with a steady diet of interesting material.
@SbotGwa alternates between GWAS from @uk_biobank and @FinnGen_FI.
Yesterday's Manhattan plot from FinnGen yielded a single hit for the trait "other and unspecified corneal deformities and disorders"
Let's dive in,
Jan 2, 2021 5 tweets 3 min read
Thanks for the shout out, and welcome any new followers.

I like looking at GWAS and trying to decipher the causal biology behind the hits.
I use this account to highlight interesting results and provide links to the tools and approaches I find most useful.
One theme I come back to is that because the closest gene is usually the correct causal gene, any analysis of a new GWAS should start there.
Here's a story from almost a year ago, a great study on heart trabeculae that initially ignored the closest genes

Jan 2, 2021 18 tweets 8 min read
I love the idea of this GWAS. The authors estimated the abundance of mtDNA in the blood of @uk_biobank participants by using the intensities of probes mapping to the mito genome

@HaggSara
Juulia Jylhävä
Yunzhang Wang
Kamila Czene & 
Felix Grassmann

pubmed.ncbi.nlm.nih.gov/33385171/
Dec 24, 2020 9 tweets 5 min read
So this is fun - what connects these images:
Well, the image on the right is supposed to represent yesterday's @SbotGwa @uk_biobank Manhattan plot for "Leg Fat Free Mass (Left)":
May 24, 2020 8 tweets 4 min read
I do love this figure from the new PR interval paper.

Interestingly 16/25 highlighted genes are in fact the closest gene to the lead SNP.

64%, very similar to eQTLs, pQTLs and metabolite QTLs.
About half the genes in the diagram (the ones with a 7) are also involved in closely related monogenic diseases. This is generally a reliable way to identify a true causal gene.

I looked across all the loci at all genes involved in "rare cardiac diseases"
orpha.net/consor/cgi-bin…
Feb 19, 2020 10 tweets 4 min read
Here's how I see the SNP->gene gold standard issue.

This map separates the problem of identifying the causal transcript for a disease from the issue of identifying which transcripts are altered by a SNP.

As we know from, e.g., lactase, many mRNAs are altered but only 1 is causal.
The map acknowledges that a GWAS association is probably acting through a functional variant that impacts a transcript that (usually) impacts a protein that may alter a biomarker or intermediate phenotype, which manifests as a change in disease risk or complex phenotype.