Sasha Gusev
Mar 1 · 16 tweets · 6 min read
Some thoughts on the ability to distinguish populations with genetic variation, why that means little for trait differences, and why there are other good reasons to collect diverse data. 🧵
I was pleasantly surprised to see no one mount a strong defense of "biological race" in this thread. Even the people throwing this term around seem to realize it's not supported by data. Instead the conversation shifts to population "distinguishability".

For example, a random twitterer (left) and a professor (right) emphasizing that genetic variation can be used to "distinguish" populations. And it's true, one can aggregate small per-variant differences into genetic ancestry estimates that often correlate highly with geography.
Image
Image
Moreover, it's true at essentially all scales: ancestry inference in self-reported whites in the US correlates with European country references; ancestry inference in self-reported "White British" in the UK correlates with latitude/longitude and counties in the UK; and so on.
In fact we know that, given enough sites, a method like PCA can identify correlations down to a handful of generations (or even pick up families). Of course, no one argues counties/zipcodes are biological units, so "distinguishability" alone is not meaningful. What is meaningful?
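To make the "given enough sites" point concrete, here is a minimal simulation sketch. It is not PCA: it swaps in a simpler weighted-sum ancestry score, and the populations, the per-SNP frequency shift `delta`, and the function name `simulate_separation` are all hypothetical choices made for illustration. The only claim is qualitative: with a tiny per-variant difference, the gap between two groups (in SD units) keeps growing as more SNPs are aggregated.

```python
import random
import statistics

def simulate_separation(n_snps, n_per_pop=100, delta=0.03, seed=0):
    """Gap between two simulated populations' ancestry scores, in pooled SD units."""
    rng = random.Random(seed)
    # Assumed setup: every SNP differs by only +/- 2*delta in frequency between groups.
    freqs = []
    for _ in range(n_snps):
        p = rng.uniform(0.2, 0.8)
        d = delta if rng.random() < 0.5 else -delta
        freqs.append((p + d, p - d))

    def score(pop):
        # Sum diploid genotypes, each weighted by the (small) frequency difference.
        total = 0.0
        for p_a, p_b in freqs:
            p = p_a if pop == 0 else p_b
            genotype = (rng.random() < p) + (rng.random() < p)
            total += genotype * (p_a - p_b)
        return total

    scores_a = [score(0) for _ in range(n_per_pop)]
    scores_b = [score(1) for _ in range(n_per_pop)]
    pooled_sd = ((statistics.pvariance(scores_a) + statistics.pvariance(scores_b)) / 2) ** 0.5
    return (statistics.mean(scores_a) - statistics.mean(scores_b)) / pooled_sd

for n in (10, 100, 1000):
    print(n, "SNPs -> separation of", round(simulate_separation(n), 2), "SD")
```

Under these assumptions the separation scales roughly with the square root of the number of SNPs, which is why "distinguishability" mostly reflects how many data points go into the analysis.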
We might be interested in meaningful distinguishability of genetically driven traits. But unlike genetic ancestry, a neutral trait does NOT become more differentiated as you aggregate more variants. So distinguishable ancestry NEED NOT translate into trait differences.
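A rough sketch of why aggregation does not help for a neutral trait, under stated assumptions: the trait is purely additive, effect sizes are drawn independently of which population carries the higher allele frequency (that independence is what "neutral" means here), and the frequency shift `delta` and function names are hypothetical. The gap is computed from expected genotype means rather than sampled individuals.

```python
import random

def standardized_trait_gap(n_variants, delta=0.03, rng=None):
    """Expected between-population mean gap of a neutral additive trait, in genetic SD units."""
    rng = rng or random.Random()
    gap = 0.0
    variance = 0.0
    for _ in range(n_variants):
        p = rng.uniform(0.2, 0.8)
        d = delta if rng.random() < 0.5 else -delta  # which group has the higher frequency
        beta = rng.gauss(0.0, 1.0)                   # effect size, independent of d (neutrality)
        gap += beta * 2 * (2 * d)                    # mean genotype difference is 2 * (p_a - p_b)
        variance += beta ** 2 * 2 * p * (1 - p)      # approximate additive variance at mean frequency
    return gap / variance ** 0.5

def gap_summary(n_variants, n_traits=300, seed=1):
    """Mean and root-mean-square gap across many independently simulated traits."""
    rng = random.Random(seed)
    gaps = [standardized_trait_gap(n_variants, rng=rng) for _ in range(n_traits)]
    mean_gap = sum(gaps) / len(gaps)
    rms_gap = (sum(g * g for g in gaps) / len(gaps)) ** 0.5
    return mean_gap, rms_gap

for n in (10, 100, 1000):
    mean_gap, rms_gap = gap_summary(n)
    print(n, "variants -> mean gap", round(mean_gap, 3), ", typical size", round(rms_gap, 3))
```

Because the signed contributions cancel like a random walk while the trait variance grows at the same rate, the typical gap stays small and centered at zero whether the trait has 10 or 1000 variants.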
We even have bounds: the expected between-population neutral trait variance is Fst * heritability. For human populations and traits this is very low (1-8%) even if we take genetic ancestry extremes, and of course these differences are centered at zero and go in either direction.
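Taking the bound above at face value, the arithmetic is a single multiplication. The (Fst, heritability) pairs below are illustrative values assumed for this sketch, not numbers taken from the thread or from any specific study.

```python
def between_population_share(fst, heritability):
    """Expected share of neutral trait variance between populations, using Fst * heritability."""
    return fst * heritability

# Illustrative (Fst, heritability) pairs; assumed for the example only.
examples = [(0.05, 0.2), (0.10, 0.5), (0.15, 0.5)]
shares = [between_population_share(fst, h2) for fst, h2 in examples]

for (fst, h2), share in zip(examples, shares):
    print(f"Fst={fst:.2f}, h2={h2:.1f} -> between-population share = {share:.1%}")
```

Even the larger assumed inputs leave the great majority of trait variance within populations, and the sign of any expected difference is not determined.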
We might be interested in individual large-effect variants with big frequency differences due to bottlenecks (like BRCA) or selective sweeps (like pigment or lactase). In the early genome days there was great speculation that "divergent genes" would explain trait disparities.
Such studies have been run and, as it turns out, "hard sweeps" are very infrequent. This is broadly appreciated in the field but draws intense backlash on twitter, so I'll just quote some sources [web.stanford.edu/group/pritchar… , nap.nationalacademies.org/catalog/26902/…] and save the details for later.

Lastly, perhaps the causal effects of common variants differ substantially between populations (for example due to interactions). Though more work is needed, studies using local ancestry show this does not generally appear to be the case. Details here:
In short, "distinguishable" ancestry in PCA tells us nothing about traits, whether neutral trait means, hard locus-specific selection, or genome-wide effect sizes. So why do we collect diverse data? IMO there are three good reasons, and none of them have to do with trait divergence:
1: Diverse populations are likely to have more diverse *environments*, which (we hope) is useful for understanding the relationships between genetic variation and context [ex: ncbi.nlm.nih.gov/pmc/articles/P…], as well as enriching for more environmental risk factors.
2: Association studies estimate effects from "tag" SNPs + noise due to LD and frequency. The noise is further amplified across populations leading to poor prediction. Diverse data can improve prediction and increase sensitivity by cleaning up this noise.
3: Diverse data picks up a few more rare variants (especially non-singletons). These contribute very little to group-specific trait differences, but they can improve imputation, identify novel biology, & be important to their carriers (ex: drug reactions).
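Reason 2 can be sketched with a toy simulation. Everything here is an assumption for illustration: genotypes are approximated as standardized Gaussians, a single causal variant explains half the trait variance (far more than any real common variant), and LD between the tag and the causal variant is set to r = 0.9 in population A and r = 0.5 in population B. The causal effect is identical in both populations; only the tagging differs.

```python
import random

def simulate_population(ld_r, n, beta, rng):
    """Standardized (tag, trait) pairs; the tag correlates with the causal variant at ld_r."""
    tags, traits = [], []
    for _ in range(n):
        causal = rng.gauss(0.0, 1.0)
        tag = ld_r * causal + (1 - ld_r ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        trait = beta * causal + (1 - beta ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        tags.append(tag)
        traits.append(trait)
    return tags, traits

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def prediction_r2(weight, tags, traits):
    """1 - SSE/SST when predicting the trait as weight * tag (no intercept; means are near zero)."""
    mean_y = sum(traits) / len(traits)
    sse = sum((y - weight * t) ** 2 for t, y in zip(tags, traits))
    sst = sum((y - mean_y) ** 2 for y in traits)
    return 1 - sse / sst

rng = random.Random(2)
beta = 0.5 ** 0.5  # same causal effect in both populations
tags_a, traits_a = simulate_population(0.9, 4000, beta, rng)
tags_b, traits_b = simulate_population(0.5, 4000, beta, rng)

w_a = ols_slope(tags_a, traits_a)  # tag effect as estimated in population A
w_b = ols_slope(tags_b, traits_b)  # tag effect as estimated in population B
r2_a = prediction_r2(w_a, tags_a, traits_a)
r2_b = prediction_r2(w_a, tags_b, traits_b)  # A's weight applied in B

print("tag effect estimated in A:", round(w_a, 3), "| in B:", round(w_b, 3))
print("prediction R2 using A's weight, in A:", round(r2_a, 3), "| in B:", round(r2_b, 3))
```

In this sketch the apparent "effect size difference" and the drop in prediction accuracy come entirely from the tag SNP, which is the kind of noise more diverse training data can help clean up.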
TLDR: "distinguishability" is mostly a matter of having enough data points in the analysis. We collect diverse populations not because we expect much trait divergence, but to capture environments, better tag SNPs/LD, and variants in a more useful frequency range. /fin