I’ve been working with an IVF startup, @herasight, that has already screened hundreds of embryos. Today we come out of stealth with a paper showing that our predictors for 17 diseases — validated within-family — beat the competition, with improved performance in non-Europeans🧵
Check out our website where you can play with our widget and download our white paper: herasight.com
In our paper, we detail our polygenic scores (PGS) for 17 diseases using a custom meta-analysis. We used state-of-the-art methods to create PGSs based on 7.3M SNPs. Our most predictive PGSs explained ~20% of the variance in liability for prostate cancer and type-II diabetes.
We compared our PGSs to the best published academic efforts. Our scores explained 28% and 87% more liability variance on average than those of Thompson et al. and Mars et al., respectively, with substantial improvements for breast cancer, melanoma, coronary artery disease, gout, and multiple sclerosis.
We compared our scores to published data from embryo screening companies Orchid and Genomic Prediction. Our scores explained 122% (Orchid) and 193% (GP) more liability variance, translating to greater absolute and relative risk reductions (table below).
For example, for a European ancestry couple with one parent affected by type-II diabetes, our PGS would give an expected absolute risk reduction of 12%, double the 6% expected from Orchid and GP.
We found a relative risk calculator on GP's website (lifeview.com/ehs) that appears to give different numbers from those in their published validation paper. On closer inspection, they state these results are from "...selecting among unrelated individuals of European descent", which does not account for the reduced genetic variation within-family. Thus, these numbers give inflated estimates of what could be achieved through embryo screening using their scores.
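To make the "reduced genetic variation within-family" point concrete, here is a minimal simulation sketch. It is a toy model, not the analysis from our paper: it assumes random mating, independent biallelic loci with allele frequency 0.5, and equal effect sizes, and the function names are made up for illustration. Under those assumptions, the PGS variance among siblings is about half the PGS variance among unrelated people.

```python
import random
from statistics import pvariance


def simulate_sib_pair(n_loci, rng):
    """Return the toy PGS (count of '1' alleles) for two full siblings."""
    def parent():
        # Two alleles per locus, each 0 or 1 with probability 0.5 (assumed).
        return [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_loci)]

    def child(mum, dad):
        # Each parent transmits one of their two alleles at random per locus.
        return sum(rng.choice(m) + rng.choice(d) for m, d in zip(mum, dad))

    mum, dad = parent(), parent()
    return child(mum, dad), child(mum, dad)


def sibling_variance_ratio(n_families=4000, n_loci=30, seed=1):
    """Ratio of within-family PGS variance to population PGS variance."""
    rng = random.Random(seed)
    sibs = [simulate_sib_pair(n_loci, rng) for _ in range(n_families)]
    # One child per family gives a sample of unrelated individuals.
    population_var = pvariance([first for first, _ in sibs])
    # Var(sib1 - sib2) equals twice the within-family variance.
    within_var = pvariance([a - b for a, b in sibs]) / 2
    return within_var / population_var


if __name__ == "__main__":
    print(f"within-family / population variance: {sibling_variance_ratio():.2f}")
```

Because embryos from one couple are siblings, selection can only act on that smaller within-family spread, which is why numbers based on unrelated individuals overstate the achievable risk reduction.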
Orchid provided validations for only 6 out of the 12 traits offered, and we were unable to determine appropriate confidence intervals for their reported performance metrics.
If the prediction ability of a PGS is smaller within-family — as I’ve shown for some traits in my own research — it won’t work as well for embryo screening. We therefore validated our PGS within-family, with only osteoporosis showing a significant reduction of its effect.
PGS being offered for embryo screening should undergo within-family validation. While our analysis shows that most disease PGS predict about as well within-family as in the population, osteoporosis shows that this can't be assumed and should be checked.
If the performance is lower within-family, the PGS can still be used for embryo screening, but the lower predictive ability needs to be accounted for when calibrating risk predictions.
PGS prediction accuracy declines with genetic distance from the training sample. We assessed this for our PGS to calibrate our predictions for different ancestries, finding improved prediction ability for non-Europeans compared to previously published results (Privé et al.).
The improvements are likely due to applying SBayesRC to 7.3M SNPs with functional annotations, enabling some degree of fine-mapping of causal variants shared across ancestries.
We use type-II diabetes to illustrate the utility of our scores for embryo screening. Even with 5 embryos, the absolute risk reduction ranges from 5% to 15% depending on parental ancestries and disease status, with relative risk reductions of ~50% expected in some scenarios — e.g. two EUR ancestry parents without type-II diabetes and 5 embryos to screen.
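For readers who want to see how numbers like these can arise, here is a rough Monte Carlo sketch under a liability threshold model. It is not the calculation from our paper. It assumes random mating, a PGS that predicts equally well within-family, and couples drawn at random from the population (it does not condition on parental disease status or ancestry). The function name and the 10% prevalence in the usage line are hypothetical inputs chosen for illustration; the 0.2 liability R^2 is the approximate figure quoted earlier in the thread.

```python
import random
from statistics import NormalDist


def selection_risks(r2, prevalence, n_embryos, n_couples=20000, seed=0):
    """Return (risk of a random embryo, risk of the lowest-PGS embryo)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Disease occurs when liability (variance 1) exceeds this threshold.
    threshold = std_normal.inv_cdf(1 - prevalence)
    # Half the PGS variance is between couples, half within (assumed).
    sd_half = (r2 / 2) ** 0.5
    # Liability not captured by the PGS.
    sd_rest = (1 - r2) ** 0.5

    def risk(pgs):
        # Probability that liability exceeds the threshold given the PGS.
        return 1 - std_normal.cdf((threshold - pgs) / sd_rest)

    baseline = selected = 0.0
    for _ in range(n_couples):
        midparent = rng.gauss(0, sd_half)
        embryos = [midparent + rng.gauss(0, sd_half) for _ in range(n_embryos)]
        baseline += risk(embryos[0])   # implanting without screening
        selected += risk(min(embryos))  # implanting the lowest-PGS embryo
    return baseline / n_couples, selected / n_couples


if __name__ == "__main__":
    base, sel = selection_risks(r2=0.2, prevalence=0.10, n_embryos=5)
    print(f"absolute risk reduction: {base - sel:.3f}")
    print(f"relative risk reduction: {(base - sel) / base:.2f}")
```

Conditioning on affected or unaffected parents shifts the baseline risk, which is why the absolute and relative reductions differ so much across the scenarios described above.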
Thanks to @_twolfram for leading this effort and the rest of the team (@SponceyM @jeremyli__ @JonathanAnomaly) at @herasight! Check out our website (herasight.com) and sign up to keep informed. Even more exciting science is coming soon!
At @herasight, we wanted to compare our genetic predictors (PGS) to those from @nucleusgenomics. However, in many cases, we couldn’t reconcile plausible performance of their PGSs with customer risk reports we saw — this may have misled customers about their disease risks.
Nine of their PGSs appeared to be open-source models from the PGS Catalog. Many (see table) relied on small numbers of variants despite being for polygenic diseases. State-of-the-art PGSs typically use thousands or millions of variants to maximize predictive ability.
The table gives the liability-scale R^2 (a measure of PGS prediction performance) that we back-engineered from customer reports, along with the number of variants used in each PGS and the R^2 we achieved in our independent validation in UKBB.
We back-engineered the implied liability R^2 (a measure of PGS prediction ability) from Nucleus’ customer reports. We compared this to our own validation in UKBB using the SNPs/PGS Nucleus claims to be using, and we found much lower prediction accuracy in almost all cases.
New educational attainment (EA) GWAS out today: nature.com/articles/s4158…. We expand the sample size from ~1 million to ~3 million, making it one of the largest GWAS to date. We identify more loci affecting EA and increase our ability to predict EA from genetic data, and more...🧵
First, some background: educational attainment (EA) measures the number of years of education an individual completes, information that is routinely collected by researchers in many fields, and is strongly related to many socioeconomic and health outcomes.
Twin studies estimate that the heritability of EA is around 40%, i.e. 40% of the variation in EA between individuals is due to genetic differences. Twin studies also estimate a similarly important role for the family environment: i.e. both genetics and environment are important.
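As a rough illustration of where such estimates come from, one classic approach is Falconer's formula, which compares trait correlations in identical (MZ) and fraternal (DZ) twin pairs. The sketch below uses made-up correlations chosen only to land near 40%; they are not estimates from any EA study, and modern twin analyses typically fit fuller models than this.

```python
def falconer(r_mz, r_dz):
    """Split trait variance into h2, c2, e2 from twin correlations.

    h2: heritability, c2: shared (family) environment, e2: everything else.
    Assumes additive genetics and equal environments for MZ and DZ pairs.
    """
    h2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return h2, c2, e2


if __name__ == "__main__":
    # Hypothetical correlations, for illustration only.
    h2, c2, e2 = falconer(r_mz=0.75, r_dz=0.55)
    print(f"h2={h2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
```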
New from Richard Border, @andywdahl, @flint, Noah Zaitlen, myself & others. Estimated genetic correlations between traits may be inflated by cross-trait assortative mating (xAM): biorxiv.org/content/10.110… So are apparent genetic relationships between traits a statistical artefact? 🧵
The genetic correlation between two traits is usually interpreted as a measure of the degree to which the genetic effects on one trait are correlated with the genetic effects on the other, i.e. shared underlying biology. However, this interpretation is not valid when there is xAM.
xAM means phenotype 1 in the mother is correlated with phenotype 2 in the father; for example, the education of the father could be correlated with the height of the mother. In this case, genetic variants affecting height become correlated with genetic variants affecting education.
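A toy simulation can show the mechanism. It is a sketch under strong assumptions, not the model from the preprint: the two traits have completely separate genetic components in the parents, each with an assumed heritability of 0.5, and couples are paired by ranking mothers on a noisy version of trait 1 and fathers on a noisy version of trait 2, which produces much stronger xAM than is realistic. All names and parameters are illustrative.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_xam(n_pairs=20000, h2=0.5, mate_noise=0.5, seed=0):
    """Return (genetic correlation in parents, genetic correlation in children)."""
    rng = random.Random(seed)

    def person():
        # Independent genetic values for the two traits: no shared biology.
        g1, g2 = rng.gauss(0, h2 ** 0.5), rng.gauss(0, h2 ** 0.5)
        y1 = g1 + rng.gauss(0, (1 - h2) ** 0.5)
        y2 = g2 + rng.gauss(0, (1 - h2) ** 0.5)
        return g1, g2, y1, y2

    mothers = [person() for _ in range(n_pairs)]
    fathers = [person() for _ in range(n_pairs)]
    # xAM: pair by rank of mother's trait 1 and father's trait 2 (plus noise).
    mothers.sort(key=lambda p: p[2] + rng.gauss(0, mate_noise))
    fathers.sort(key=lambda p: p[3] + rng.gauss(0, mate_noise))

    seg_sd = (h2 / 2) ** 0.5  # random segregation around the mid-parent value
    child_g1, child_g2 = [], []
    for m, f in zip(mothers, fathers):
        child_g1.append((m[0] + f[0]) / 2 + rng.gauss(0, seg_sd))
        child_g2.append((m[1] + f[1]) / 2 + rng.gauss(0, seg_sd))

    parents = mothers + fathers
    parent_r = pearson([p[0] for p in parents], [p[1] for p in parents])
    child_r = pearson(child_g1, child_g2)
    return parent_r, child_r


if __name__ == "__main__":
    parent_r, child_r = simulate_xam()
    print(f"parents: {parent_r:.3f}, children: {child_r:.3f}")
```

In the children, the genetic components of the two traits are positively correlated after a single generation even though no variant affects both traits, so a naive reading of that correlation as shared biology would be wrong.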