Excited to share our latest article published in @PLOSGenetics: journals.plos.org/plosgenetics/a…. This is joint work with @Greenwood_LDI, @TianyuanLu1 and others. A brief thread (1/n):
This work addresses a recurring challenge in the analysis and interpretation of genetic association studies: which genetic variants best predict, and are independently associated with, a given phenotype in the presence of population structure? (2/n)
Failing to control for confounding due to population structure, family relatedness and/or cryptic relatedness can lead to spurious associations. Many methods have focused on modeling the association between a phenotype and a single variant in a linear mixed model (LMM) with a random effect. (3/n)
Here we propose ggmix, an alternative method for fitting high-dimensional multivariable models that selects SNPs independently associated with the phenotype while also accounting for population structure. (4/n)
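For readers who want the idea in symbols, here is a generic sketch of a lasso-penalized linear mixed model. The thread does not spell out the model, so the notation ($K$ for the kinship matrix, $\sigma_g^2$ and $\sigma_e^2$ for the variance components, $\lambda$ for the penalty weight) is an assumption for illustration, and the parameterization used in the paper may differ.

```latex
% Generic penalized LMM sketch; notation is assumed, not taken from the thread.
y = X\beta + b + \varepsilon, \qquad
b \sim \mathcal{N}(0, \sigma_g^2 K), \qquad
\varepsilon \sim \mathcal{N}(0, \sigma_e^2 I_n)

V = \sigma_g^2 K + \sigma_e^2 I_n

(\hat{\beta}, \hat{\sigma}_g^2, \hat{\sigma}_e^2)
  = \arg\min_{\beta,\, \sigma_g^2,\, \sigma_e^2}\;
    \tfrac{1}{2}\log\det V
    + \tfrac{1}{2}(y - X\beta)^\top V^{-1}(y - X\beta)
    + \lambda \sum_{j=1}^{p} |\beta_j|
```

The random effect $b$ absorbs the correlation between related individuals, while the $\ell_1$ penalty sets most entries of $\beta$ to zero, so many SNPs can be modeled jointly instead of one at a time.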
We apply ggmix to predict height (a highly polygenic trait) in the @uk_biobank and show that it produces sparser models with better predictive accuracy compared to the lasso with a PC adjustment and a Bayesian LMM. (5/n)
In a mouse cross example, we applied ggmix to find loci associated with mouse sensitivity to mycobacterial infection and showed that our method is robust to perturbations in a bootstrap analysis. Our re-analysis of the data also led to some potentially new findings. (6/n)
Many methods for single-SNP analyses construct the kinship matrix using all chromosomes except the one on which the marker being tested is located. This approach is not possible if we want to model many SNPs (across many chromosomes) jointly to create a polygenic risk score. (8/n)
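To make the leave-one-chromosome-out idea concrete, here is a minimal sketch in plain Python. It assumes a common kinship estimator, K = Z Zᵀ / p with standardized genotype columns Z, which the thread does not specify. The function names (`standardize`, `kinship`, `loco_kinship`) and the toy genotypes are hypothetical and are not part of the ggmix package.

```python
from statistics import mean, pstdev

def standardize(column):
    """Center and scale one SNP's genotype column (0/1/2 allele counts)."""
    mu, sd = mean(column), pstdev(column)
    if sd == 0:  # monomorphic SNP carries no information; zero it out
        return [0.0 for _ in column]
    return [(g - mu) / sd for g in column]

def kinship(genotypes, snp_ids):
    """Assumed estimator: K = Z Z^T / p over the chosen SNPs (Z standardized)."""
    if not snp_ids:
        raise ValueError("need at least one SNP to build a kinship matrix")
    n = len(next(iter(genotypes.values())))
    cols = [standardize(genotypes[s]) for s in snp_ids]
    p = len(cols)
    return [[sum(c[i] * c[j] for c in cols) / p for j in range(n)]
            for i in range(n)]

def loco_kinship(genotypes, chrom_of, held_out_chrom):
    """Kinship from every SNP that is NOT on the held-out chromosome."""
    kept = [s for s in genotypes if chrom_of[s] != held_out_chrom]
    return kinship(genotypes, kept)

# Toy data: 4 individuals, 4 SNPs on 2 chromosomes (all values made up).
genotypes = {
    "snp1": [0, 1, 2, 1],
    "snp2": [2, 2, 0, 1],
    "snp3": [0, 0, 1, 2],
    "snp4": [1, 0, 1, 2],
}
chrom_of = {"snp1": 1, "snp2": 1, "snp3": 2, "snp4": 2}

# Single-SNP testing on chromosome 1 would use a kinship built without it.
K_without_chr1 = loco_kinship(genotypes, chrom_of, held_out_chrom=1)
# A joint multi-SNP model has no single chromosome to leave out.
K_all = kinship(genotypes, list(genotypes))
```

When a model includes SNPs from every chromosome at once, there is no single `held_out_chrom` to pass, which is why the joint setting has to fall back on a kinship matrix like `K_all`.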
Our simulation study compares the performance of different methods in terms of variable selection and prediction error when the causal variants are included in the calculation of the kinship matrix. (9/n)
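As an illustration of what "variable selection and prediction error" can mean in a simulation where the causal SNPs are known, here is a small sketch. The thread does not say which metrics the paper reports, so true/false positive rates and root-mean-squared error are assumed here as common choices. The function names and toy inputs are hypothetical.

```python
from math import sqrt

def selection_metrics(selected, causal, all_snps):
    """Compare the SNPs a method selected against the simulated causal set."""
    selected, causal, all_snps = set(selected), set(causal), set(all_snps)
    null = all_snps - causal                      # SNPs with no true effect
    true_pos = len(selected & causal)
    false_pos = len(selected & null)
    return {
        "true_positive_rate": true_pos / len(causal) if causal else 0.0,
        "false_positive_rate": false_pos / len(null) if null else 0.0,
        "model_size": len(selected),
    }

def rmse(y_true, y_pred):
    """Root-mean-squared prediction error on held-out individuals."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# Toy example: 10 SNPs, 2 truly causal, a method that picked 3 (made-up values).
all_snps = [f"snp{i}" for i in range(1, 11)]
metrics = selection_metrics(
    selected=["snp1", "snp3", "snp7"],
    causal=["snp1", "snp2"],
    all_snps=all_snps,
)
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Running the same two functions once with the causal SNPs included in the kinship matrix and once with them excluded gives directly comparable numbers for each method.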
Overall, we observed that variable selection results and prediction error for ggmix were similar regardless of whether the causal SNPs were in the kinship matrix or not. (10/n)
This result is encouraging since in practice the kinship matrix is constructed from a random sample of SNPs across the genome, some of which are likely to be causal, particularly in polygenic traits. (11/n)
Simulations show the principal component adjustment (lasso+PC) method may not be the best approach to control for confounding by population structure, particularly when variable selection is of interest. We believe more work is needed to investigate this further. (12/n)
Our method is freely available in the ggmix #rstat package on CRAN cran.r-project.org/package=ggmix with extensive documentation available at sahirbhatnagar.com/ggmix/. (end of thread)
Keep Current with Sahir Rai Bhatnagar