Sasha Gusev

Jan 26, 2024 • 18 tweets • 10 min read

I've written up a "crash course" on population genetics parameters useful for thinking about recent selection, heritability, and group differences (as part of a longer write-up on these concepts).

I'll summarize the key points here 🧵: gusevlab.org/projects/hsq/#…

A preface: if you're generally interested in population genetics it's better to learn from first principles, and I've linked some useful resources to that end (many free). In particular (spoiler) recent evolution excludes some of the more interesting concepts and personalities.

But one downside of the general approach is that it can be hard to get a feel for real time (for example when populations are modeled in terms of 4N_eμ). Here we'll fix three parameters based on data: time (t = 65k years), population size (N_e = 10k), and selection (s ≈ 10^-4).

We can start by modeling how genetic variants move under neutral drift: very slowly! In 120k years a 5% allele is expected to accumulate just ~1% of drift variance. We can also think in terms of allele "age", and common variants are VERY old (mostly pre-migration).
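A minimal sketch of the drift calculation, assuming the standard diploid Wright-Fisher variance formula and a generation time of 25 years (the generation time is an assumption, not a number from the thread). It reads the "~1%" figure as a variance in allele frequency of about 0.01.

```python
def drift_variance(p0, ne, generations):
    """Variance in allele frequency after neutral Wright-Fisher drift.

    Var(p_t) = p0 * (1 - p0) * (1 - (1 - 1/(2*Ne))**t) for a diploid population.
    """
    return p0 * (1 - p0) * (1 - (1 - 1 / (2 * ne)) ** generations)


YEARS_PER_GENERATION = 25  # assumption: not stated in the thread
NE = 10_000                # effective population size fixed in the thread

generations = 120_000 // YEARS_PER_GENERATION  # 4800 generations
variance = drift_variance(0.05, NE, generations)
print(f"drift variance for a 5% allele: {variance:.4f}")  # roughly 0.01
```

Changing the assumed generation time within a plausible range (say 25 to 30 years) moves the result only slightly.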

Now let's add selection. Under the weak coefficients we see in real data, selection acts very slowly. Most common variants under negative selection will stay common. And new variants under positive selection will stay rare. It would take ~300k years for a 95% allele to go to 1%.
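To get a feel for how little a weak coefficient moves an allele, here is a minimal deterministic sketch. It assumes multiplicative (genic) selection, where the odds p/(1-p) scale by (1+s) each generation, ignores drift entirely, and again assumes 25 years per generation. The starting frequencies are hypothetical, and the sketch does not try to reproduce the ~300k-year figure, which depends on the model used in the write-up.

```python
def freq_after(p0, s, generations):
    """Deterministic allele frequency under multiplicative (genic) selection.

    The odds p/(1-p) are multiplied by (1 + s) every generation; drift is ignored.
    """
    odds = p0 / (1 - p0) * (1 + s) ** generations
    return odds / (1 + odds)


YEARS_PER_GENERATION = 25                    # assumption: not stated in the thread
gens = 65_000 // YEARS_PER_GENERATION        # t = 65k years -> 2600 generations

# A common (50%) variant under negative selection, s = -1e-4
p_neg = freq_after(0.5, -1e-4, gens)
# A rare (0.1%, hypothetical starting point) variant under positive selection, s = +1e-4
p_pos = freq_after(0.001, 1e-4, gens)

print(f"50% allele, s=-1e-4:  {p_neg:.4f}")  # still common, about 0.435
print(f"0.1% allele, s=+1e-4: {p_pos:.5f}")  # still rare, about 0.0013
```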

These shifts are even slower under stabilizing selection, where traits move towards a fitness optimum instead of directionally up/down. This is likely the way populations have adapted to changing environments (we'll come back to this later).

Now that we have a model for selection and drift, we can test whether variants are under selection. It turns out this test is very powerful when selection is strong: even 100 samples are enough. In the "nearly neutral" range, by contrast, selection is effectively undetectable.

We can quantify genetic variance using F_ST, a fundamental (and often misunderstood) measure of within- versus between-population correlations. Part of the confusion is that there are two derivations, Nei's and Hudson's, and they can be meaningfully different.
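A minimal sketch of how two definitions can disagree for the same pair of allele frequencies. It assumes population-level frequencies with no sample-size correction: Hudson's form as 1 - H_w/H_b, and a heterozygosity-based (H_T - H_S)/H_T ratio in the style of Nei's G_ST. The exact "Nei's" definition used in the write-up may differ from this second form.

```python
def hudson_fst(p1, p2):
    """Hudson-style F_ST for one biallelic variant: 1 - H_within / H_between."""
    between = p1 * (1 - p2) + p2 * (1 - p1)
    return (p1 - p2) ** 2 / between


def gst_style_fst(p1, p2):
    """Heterozygosity-based ratio (H_T - H_S) / H_T for two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                         # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2    # mean within-population
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two populations
p1, p2 = 0.1, 0.5
f_hudson = hudson_fst(p1, p2)
f_gst = gst_style_fst(p1, p2)
print(f"Hudson-style: {f_hudson:.3f}")  # 0.320
print(f"G_ST-style:   {f_gst:.3f}")     # 0.190
```

Both functions take the same two frequencies and return clearly different numbers, so any reported F_ST needs its definition stated.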

Under strong assumptions, F_ST can be related to population size and migration, but it is compatible with many different population dynamics in a way that can be non-linear and unintuitive.

Moreover, F_ST is highly dependent on *which variants* are used to estimate it, and this can lead to highly unintuitive results. For example, the apparently higher F_ST within chimps than between chimps and humans is an artifact of how sub-populations are tested.
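A toy illustration of the dependence on variant choice, assuming a Hudson-style F_ST combined across variants as a ratio of sums (numerators summed, denominators summed). The four allele-frequency pairs are made up for illustration and are not real data.

```python
def hudson_fst_multi(freqs_pop1, freqs_pop2):
    """Hudson-style F_ST across variants, combined as a ratio of sums."""
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        numerator += (p1 - p2) ** 2
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    return numerator / denominator


# Made-up frequencies: the first two variants are common, the last two are rare
pop1 = [0.5, 0.4, 0.02, 0.01]
pop2 = [0.3, 0.7, 0.01, 0.03]

fst_all = hudson_fst_multi(pop1, pop2)
fst_common = hudson_fst_multi(pop1[:2], pop2[:2])
fst_rare = hudson_fst_multi(pop1[2:], pop2[2:])

print(f"all variants:    {fst_all:.4f}")
print(f"common variants: {fst_common:.4f}")
print(f"rare variants:   {fst_rare:.4f}")
```

In this toy set the same two populations give an F_ST more than ten times larger on the common variants than on the rare ones, which is the kind of sensitivity to ascertainment the tweet describes.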

A useful derivation is that Hudson's F_ST bounds the difference in trait mean between populations under neutral drift. We can confirm this in simulations. For a typical ~10% heritable trait, the (African/European) population difference is at most 1.5% (in either direction).

That's under neutrality. Under stabilizing selection, things are constrained even further, but in complicated ways. After a shift in the fitness optimum, genetic variation is first rapidly selected on, and then gradually (and mostly arbitrarily) purified out of the population.

Between populations with the same fitness optimum, the mean trait value will be more constrained than under neutrality. But it will also look like genetic variation has changed MORE substantially (e.g. F_ST). Interpretation is even more complicated with environmental shifts.

Finally, this brings us to the Breeder's Equation, which connects heritability and the response to selection under a fixed environment. In controlled breeding experiments (e.g. maize) response can be stable for many generations (consistent with polygenicity and new mutations).
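The Breeder's Equation itself is R = h² × S: the response to selection equals heritability times the selection differential. A minimal sketch, assuming h² and S stay constant across generations (an idealization, not a claim about any real experiment); the input values are hypothetical.

```python
def response_to_selection(h2, selection_differential):
    """Breeder's Equation: R = h^2 * S, the per-generation change in the trait mean."""
    return h2 * selection_differential


def cumulative_response(h2, selection_differential, generations):
    """Trait mean after each generation, assuming h^2 and S stay constant."""
    mean = 0.0
    trajectory = []
    for _ in range(generations):
        mean += response_to_selection(h2, selection_differential)
        trajectory.append(mean)
    return trajectory


# Hypothetical values: h^2 = 0.3, selection differential of 1 phenotypic SD
trajectory = cumulative_response(0.3, 1.0, 10)
print(f"shift after 10 generations: {trajectory[-1]:.1f} SD")  # 3.0 SD
```

Under these assumptions the response is linear in time, which is the "stable for many generations" pattern. The stasis cases discussed next are ones where the observed response falls short of this prediction.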

But in natural populations, the response often appears static or even negative (aka the "stasis paradox")! I highlight some examples compiled by Walsh & Lynch: bias in heritability estimates, indirect/environmental confounding, or shifts in the environment are all at play.

This "missing response" echoes the debate around "missing heritability", where molecular methods often produce lower estimates and identify environmental confounders. I wonder ... are humans more like maize under controlled breeding or like natural evolving populations? /fin

By the way, all the figures and simulations are pretty simple but I've put the code here in case it's useful: github.com/gusevlab/hsq_a… Let me know if you spot an error.

More from @SashaGusevPosts

Sasha Gusev

@SashaGusevPosts

Nov 21, 2025

I wrote a little bit about the "missing heritability" question and several recent studies that have brought it to a close. A short 🧵

For nearly two decades the field has been asking why heritability estimates from molecular studies are so far below estimates from twin studies (nature.com/news/2008/0811…). Are molecular studies missing important genetic variation or are twin studies biased by strong assumptions?

Recently, multiple innovative methods have been developed to estimate "narrow-sense" heritability directly from genetic data. These methods make varying assumptions about environments and interactions, and thus allow us to triangulate on the true parameter.

Read 10 tweets

Sasha Gusev

@SashaGusevPosts

Oct 14, 2025

https://twitter.com/TheAtlantic/status/1977762525241381296

Eric Turkheimer has a good piece about a bet he made with Charles Murray regarding the genetic understanding of IQ (or, really, the lack of it). Murray being so wrong in his prediction should make us question his world model, but it's also worth commenting on his response.

Murray has, for some time now, been workshopping the excuse that progress on IQ genetics was blocked by researchers being denied the access to the relevant databases. This is patently untrue!

https://x.com/SashaGusevPosts/status/1775652749813911809

First, one of the largest genetic analyses to date of *any* trait is of educational attainment, a phenotype Murray himself has used as a proxy for intelligence. Surely a study of 3 million should have been enough to satisfy Murray's prediction.

Read 7 tweets

Sasha Gusev

@SashaGusevPosts

Sep 18, 2025

https://twitter.com/charlesmurray/status/1968053071847760235

Murray and most of race twitter have apparently been fooled by this completely fabricated analysis purporting to show African ancestry is associated with IQ. People lie on twitter all the time, but this is both more revealing and more disturbing than usual. A 🧵

Revealing in that it shows how quantitative racism is just an exercise in manipulating data to fit the preconceived conclusion. Disturbing because this time private data is being used and the results, which cannot be easily verified, are just flatly invented.

What's actually going on? Some guy claims to have an analysis showing that African ancestry differences between siblings are associated with IQ differences in the UK Biobank, implying an ancestry difference in within-family influences.

Read 24 tweets

Sasha Gusev

@SashaGusevPosts

Aug 1, 2025

https://twitter.com/AlexTISYoung/status/1950950315395875304

A few thoughts on Herasight, the new embryo selection company. First, the post below and the white paper imply that competitors like Nucleus have been marketing and selling grossly erroneous risk estimates. This is shocking if true! 🧵

I wrote last year about the un-seriousness with which Nucleus approached their IQ product and the damage it could do to genetic prediction and research more broadly (theinfinitesimal.substack.com/p/genomic-pred…). This appears to have been a broader pattern beyond IQ, extending even to rare disease.

People who care about this technology should be furious at Nucleus and their collaborators (as well as Orchid and Genomic Prediction for their own errors). Finding such flaws should not require reverse-engineering by a competitor. These products clearly need independent audits.

Read 14 tweets

Sasha Gusev

@SashaGusevPosts

Jun 24, 2025

https://twitter.com/krichard121212/status/1937264266375213448

Oof. Polygenic scores for IQ lose 75% of their explained variance when adding family controls, even worse than the attenuation for Educational Attainment. These are the scores Silicon Valley is using to select embryos 😬.

A few thoughts on this study ...

The TEDS cohort used here is a very large study with high-quality cognitive assessments collected over multiple time points. It is probably the most impressive twin study of IQ to date. That means very little room for data quality / measurement error issues.

https://x.com/DamienMorris/status/1934946942326341766

It is important to highlight surprising null results. Just last week we were hypothesizing that large IQ score attenuation could be a study bias or an artifact of the Wilson Effect. Now we see it replicate in an independent study with adults.

Read 12 tweets

Sasha Gusev

@SashaGusevPosts

Jun 17, 2025

@notcomplex_ @krichard1212 The authors fit a non-identifiable Model B, which produces a table full of NA's. Then they try to interpret this model to fix it. That makes no sense. The parameters of this model will be completely arbitrary, so using it to decide what to prune is also statistically invalid.

@notcomplex_ @krichard1212 At various points later on they talk about "Heywood cases", which are out-of-bounds parameters or negative variances, but no such out-of-bounds parameters are actually present in the tables (and, again, you cannot interpret these from the non-identified model).

@notcomplex_ @krichard1212 So none of the decisions make statistical sense and either reflect someone who doesn't know what they're doing or is intentionally trying to find the model fit they like. True to form given they missed a fatal error with model A, misinterpreted AIC comparisons, etc.

Read 4 tweets
