Fantastic talk by @kapoormanav on the genetic architecture of the South Asian Indian population, based on analysis of more than 15,000 exomes from the states of Rajasthan and Maharashtra in India. #ASHG22
India is one of the most diverse countries in the world, with 22 official spoken languages, ~3000 castes, 25,000 sub-castes and a population of 1.4 billion. And yet India is one of the most underrepresented countries in genetic studies.
The extreme diversity in terms of language, caste and culture makes the Indian population unique, as these factors have sculpted the genetic architecture of Indians over hundreds to thousands of years.
This fascinating fact can be readily appreciated when we cluster Indian individuals based on their genetic data, which Manav showed in beautiful UMAP plots based on the first 10 PCs.
Individuals cluster together based on the language they speak more than on where they live. Within each language cluster, individuals cluster based on their caste and sub-caste.
This is because individuals marry only within their caste, and those who marry outside it become social outcasts and sometimes face grave consequences.
Such cultural barriers to gene flow gave birth to thousands of "isolated" populations, much as geographical barriers produce isolated populations (e.g. the Northern Isles).
Note that sociocultural barriers are more powerful than geographical barriers to gene flow. This results in founder effects that are stronger than what we've seen in well-studied founder populations like Finns and Ashkenazi Jews.
In the current larger dataset, Manav was able to recapitulate the strong founder effects in terms of high within-community IBD scores for many caste communities.
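For readers unfamiliar with IBD scores, here is a minimal sketch of how one could be computed. It assumes the score is the total length of IBD segments shared within a community divided by the number of possible pairs of individuals. The 3-20 cM length window, the function name `ibd_score` and the toy sample IDs are illustrative assumptions, not details from the talk.

```python
def ibd_score(segments, members, min_cm=3.0, max_cm=20.0):
    """Average total IBD length (in cM) shared per pair within one community.

    segments: iterable of (sample_a, sample_b, length_cm) tuples.
    members:  sample IDs belonging to the community.
    min_cm, max_cm: assumed length window for segments that count.
    """
    members = set(members)
    n_pairs = len(members) * (len(members) - 1) // 2
    if n_pairs == 0:
        raise ValueError("need at least two individuals in the community")
    total = sum(
        length
        for a, b, length in segments
        if a != b
        and a in members
        and b in members
        and min_cm <= length <= max_cm
    )
    return total / n_pairs


# Toy, made-up data purely for illustration.
segments = [
    ("s1", "s2", 5.0),
    ("s1", "s3", 12.0),
    ("s2", "s3", 2.0),  # shorter than min_cm, ignored
    ("s1", "s4", 8.0),  # s4 is outside the community, ignored
]
community = ["s1", "s2", "s3"]
score = ibd_score(segments, community)
print(f"IBD score: {score:.2f} cM per pair")
```

A higher score means pairs of individuals in the community share more of their genome through recent common ancestors, which is the signature of a strong founder effect.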
Such strong founder effects seem to have had fascinating consequences for the admixture proportions of present-day Indians.
Manav modeled the admixture of the study participants based on three major ancestries: Ancestral South Indians (ASI, Indo-Dravidians), Ancestral North Indians (ANI, Indo-Aryans) and Europeans (EUR).
Moving along the north-to-south axis, as expected, many communities showed a linear increase in ASI and a decrease in ANI and EUR. But some communities stood out, showing predominantly ANI (Jain) or predominantly ASI (Koli) ancestry.
This suggests that such communities have experienced strong barriers to gene flow due to strong endogamy practices, which was also reflected in the high IBD scores in those communities. That was fascinating!
Another area that has been poorly explored in the Indian context is positive selection. Humans who settled across this large country many thousands of years ago, over multiple migrations, surely faced many challenges (food, pathogens etc.) and underwent adaptation.
We know little about such positive selection. Analysing the genetic data of the 15k samples, Manav identified 24 loci across the genome with strong signs of positive selection. Some are known, e.g. LCT, HLA etc.
Some are novel. Manav highlighted a locus on chr 4 containing the gene OTOP1, which encodes a zinc-sensitive channel protein that plays an important role in inner ear balance. Variants within this positively selected region strongly associate with vertigo.
As a South Asian Indian who has suffered from motion sickness since childhood and was diagnosed with vertigo in adulthood, I am particularly fascinated by this finding.
Next, the most important topic: human knockouts. As in Pakistan, consanguineous marriages are common in India, particularly within certain caste communities, and this results in a high frequency of homozygous loss-of-function mutations.
Manav showed that the 15k Indian samples had 5 times as many homozygous pLOFs as a sample-size-matched European dataset. Many of these pLOFs had never been seen before.
Manav highlighted a splice donor variant in PLA2G7, which encodes an enzyme that degrades platelet activating factor. This variant is seen in 0.5-1% of Indians but is absent elsewhere in the world.
This variant increases the risk for cardiovascular disease. Individuals carrying this variant have more than five times the odds of having undergone coronary bypass surgery. It seems like there might be hundreds of thousands of human knockouts for PLA2G7 roaming around in India.
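A rough back-of-the-envelope check on that number, as a sketch only. It assumes the 0.5-1% figure is an allele frequency and that the variant is spread evenly across India's 1.4 billion people (it almost certainly is not). It uses the textbook formula q² + Fq(1 − q) for the expected homozygote fraction, where F is an inbreeding coefficient. The value F = 0.01 and the function names are hypothetical choices for illustration, not numbers from the talk.

```python
def homozygote_fraction(q, f=0.0):
    """Expected fraction of people homozygous for an allele of frequency q.

    f is the inbreeding coefficient; f = 0 gives plain Hardy-Weinberg (q^2).
    """
    return q * q + f * q * (1.0 - q)


def expected_homozygotes(q, population, f=0.0):
    """Expected number of homozygotes in a population of the given size."""
    return homozygote_fraction(q, f) * population


POPULATION = 1.4e9  # population size quoted in the thread

# Assumed allele frequencies (0.5% and 1%) and assumed inbreeding levels.
for q in (0.005, 0.01):
    for f in (0.0, 0.01):
        n = expected_homozygotes(q, POPULATION, f)
        print(f"q={q:.3f}, F={f:.2f}: ~{n:,.0f} expected homozygotes")
```

Under these assumptions, plain Hardy-Weinberg (F = 0) gives roughly 35,000 to 140,000 homozygotes, and an inbreeding coefficient of just 0.01 raises that to roughly 105,000 to 279,000, so "hundreds of thousands" is plausible at the upper end.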
The initial results that Manav presented look amazing and with increasing sample sizes these results will get even better. Watch this space for future updates.
I noticed that some good geneticists whom I respect made insensitive comments on our work. As an Indian who has lived with and seen the bad side of the caste system, I want to emphasize strongly that caste is an extremely important factor when it comes to genetics in India.
This knowledge is important for making genetic discoveries, for diagnosing diseases and, most importantly, for educating future generations about the ill effects of endogamous practices.

