Nice! Here we have an interesting paper using genetic ancestry to classify race/ethnicity in modern data and algorithms. Let's take a look at what this paper found: 🧵
First, I don't want to get too hung up on language, but TCB's tweet starts talking about "ethnicity", then shifts to "continental ancestries", and then entirely omits the largest ethnic group in the US: Hispanics. These terms have distinct definitions (). nap.nationalacademies.org/catalog/26902/…
Anyway, how well can this paper actually impute ethnicity from genetic ancestry in a large cancer population ()? ~17% of the time it gets Hispanic classification completely wrong or a no-call! worldscientific.com/doi/10.1142/97…
But even this is an overstatement, because the majority of participants either didn't list race/ethnicity or provided one that didn't fall into an established category. And the ML algorithm is *terrible* at classifying these unlabeled/partially labeled people as no calls.
This creates an interesting paradox where the algorithm can be made to look more accurate over time, when in reality participants are simply drifting into a new unlabeled space within the social construct.
I was also intrigued by the claim that ethnicity is perhaps the least socially constructed variable in social science because an algorithm can classify some of the labels with some accuracy. Is this really true?
Language is a social construct, but AI is able to do a pretty good job at classifying languages.
Religion is a social construct, but AI can do a pretty good job classifying those too, even from a cartoony illustration.
Race is a social construct, but I bet you could easily classify thes-- wait, where was I going with this?
Even more interesting, you can explain a social construct like "money" to an AI and it will figure out the natural divisions within the construct based on visual details.
Can we do the same for race/ethnicity and ancestry? Let's play a game ...
Here's a basic ancestry plot, where each point is a person. Do the green and purple dots reveal two racial groups?
Nope. The green and purple points are sampled from the same population but the purple dots just came from one family in that population.
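That family effect is easy to reproduce. Here's a minimal sketch of the kind of simulation I mean (my own toy example with made-up parameters, not the actual figure's data): draw everyone from one homogeneous population, add one large sibship, and the siblings form their own PCA "cluster".

```python
import numpy as np

rng = np.random.default_rng(0)
n_snps = 2000

# One homogeneous population: every individual is drawn from the same
# allele frequencies, so there is no population structure by construction.
freqs = rng.uniform(0.05, 0.95, n_snps)
unrelated = rng.binomial(2, freqs, size=(200, n_snps))

# One family from that same population: 30 siblings of two parents.
mom = rng.binomial(2, freqs)
dad = rng.binomial(2, freqs)
def child():
    # Each parent transmits one allele per (unlinked) SNP.
    return rng.binomial(1, mom / 2) + rng.binomial(1, dad / 2)
siblings = np.array([child() for _ in range(30)])

# Standard PCA on the centered genotype matrix.
X = np.vstack([unrelated, siblings]).astype(float)
X -= X.mean(axis=0)
U, S, _ = np.linalg.svd(X, full_matrices=False)
pcs = U[:, :2] * S[:2]

# The siblings separate cleanly from the 200 unrelated individuals on
# PC1 -- a "cluster" produced by relatedness, not by population.
sep = abs(pcs[200:, 0].mean() - pcs[:200, 0].mean()) / pcs[:200, 0].std()
print(f"family vs. unrelated separation on PC1: {sep:.1f} s.d.")
```

Relatedness dominates the top eigenvector here even though every allele came from the same frequency distribution.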
Okay but that was simulated data. Here's another one, using real data from a large-scale biobank this time. Are these ten different racial groups? Surely the pink-ish groups are a different race from the greens at least?
Nope! These are all Chinese participants of the Kadoorie biobank, color-coded by the cities they were recruited from. Ancestry inference can be extremely sensitive with enough data.
() pubmed.ncbi.nlm.nih.gov/37601966/
Ok, maybe it's unfair to use such closely related populations. Let's look at data from continental groups and use a model-based clustering approach. Surely the two orange/tan clusters here are different races or continents:
Nope! The two groups being distinguished here are Melanesians and ... the rest of the world. Asian, Middle Eastern, European, and African participants all get clustered together because of the sampling of the data. () pmc.ncbi.nlm.nih.gov/articles/PMC60…
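The sampling effect is easy to sketch with a toy clustering example (illustrative numbers, not the paper's data): put four groups near each other and one group far away, ask for K=2 clusters, and the "deepest split" is distant-group-vs-everyone-else.

```python
import numpy as np

rng = np.random.default_rng(1)

# Five toy "populations" in a 2-D genetic-distance space: four sit near
# one another, one (heavily diverged) sits far away.
centers = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [8, 8]], dtype=float)
pops = [c + rng.normal(0, 0.3, size=(50, 2)) for c in centers]
X = np.vstack(pops)
labels = np.repeat(np.arange(5), 50)

# Lloyd's k-means with K=2, farthest-first initialisation.
d0 = ((X - X[0]) ** 2).sum(axis=1)
cent = np.array([X[0], X[d0.argmax()]])
for _ in range(20):
    d = ((X[:, None, :] - cent[None, :, :]) ** 2).sum(axis=-1)
    assign = d.argmin(axis=1)
    cent = np.array([X[assign == j].mean(axis=0) for j in range(2)])

# With only two clusters allowed, the model splits "distant population
# vs. everything else": the four nearby populations get lumped together
# purely because of how the data were sampled.
print(np.unique(assign[labels < 4]), np.unique(assign[labels == 4]))
```

Which split the model "finds" is a function of K and of who got sampled, not of any natural racial boundary.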
I'm making it too hard. Maybe we need more drifted populations and tree-based clustering instead? Look at the deep divergences across these populations, surely *these* must be different races?
Nope. These are all participants from Native American tribes within a single linguistic group. Some of the most diverged populations in the world get lumped together into one socially constructed box.
TLDR: When people say a construct is "the least constructed / best / most replicable in social science", maybe they are telling you more about the quality of the social sciences than the validity of the construct. /x
Murray and most of race twitter has apparently been fooled by this completely fabricated analysis purporting to show African ancestry is associated with IQ. People lie on twitter all the time, but this is both more revealing and more disturbing than usual. A 🧵
Revealing in that it shows how quantitative racism is just an exercise in manipulating data to fit a preconceived conclusion. Disturbing because this time private data is being used and the results, which cannot be easily verified, are just flatly invented.
What's actually going on? Some guy claims to have an analysis showing that African ancestry differences between siblings are associated with IQ differences in the UK Biobank. Implying an ancestry difference in the within-family influences.
A few thoughts on Herasight, the new embryo selection company. First, the post below and the white paper imply that competitors like Nucleus have been marketing and selling grossly erroneous risk estimates. This is shocking if true! 🧵
I wrote last year about the un-seriousness with which Nucleus approached their IQ product and the damage it could do to genetic prediction and research more broadly (). This appears to have been a broader pattern beyond IQ, extending even to rare disease. theinfinitesimal.substack.com/p/genomic-pred…
People who care about this technology should be furious at Nucleus and their collaborators (as well as Orchid and Genomic Prediction for their own errors). Finding such flaws should not require reverse-engineering by a competitor. These products clearly need independent audits.
Oof. Polygenic scores for IQ lose 75% of their explained variance when adding family controls, even worse than the attenuation for Educational Attainment. These are the scores Silicon Valley is using to select embryos 😬.
The TEDS cohort used here is a very large study with high-quality cognitive assessments collected over multiple time points. It is probably the most impressive twin study of IQ to date. That means very little room for data quality / measurement error issues.
It is important to highlight surprising null results. Just last week we were hypothesizing that large IQ score attenuation could be a study bias or an artifact of the Wilson Effect. Now we see it replicate in an independent study with adults.
Racism twitter has taken to arguing that observed racial differences must be "in part" explained by genetic differences, though they demur on how much. Not only is this claim aggressively misleading, it is completely unsupported by data. A 🧵:
Genetic differences between any two populations can go in *either* direction, matching the phenotypic differences we observe or going against them. Genes also interact with the environment, which makes the whole notion of "explaining" differences intractable.
The mere fact that a trait is heritable within populations tells us nothing about the explanatory factors between populations. See: Lewontin's thought experiment; Freddie de Boer's analogy to a "jumping contest"; or actual derivations (). pubmed.ncbi.nlm.nih.gov/38470926/
James Lee and @DamienMorris have an interesting perspective paper out describing "some far-reaching conclusions" about the genetics of intelligence. This type of "where are we now" paper is very fun and more people should write them! So, where are we now? 🧵
It's a short paper and it surveys three core findings from the past decade of intelligence genetics. These sections follow a structure that I would cheekily call ... "make a bold claim in the title, then walk it back in the text".
First up, they address the concern that associations with intelligence may actually be mediated by functionally irrelevant traits like physical appearance or pigment. The argument is that IQ GWAS has demonstrated enrichments for CNS/brain structure gene sets. This is true!
The SAT/meritocracy debate has always been a bit odd to me when the test makers themselves have studies showing self-reported high-school GPA is a consistently better predictor of college GPA and always adds on top of SATs.
Clearly SATs are neither the only nor even the best measure we have of college success, and "holistic" admissions can be "meritocratic". It's up for debate whether the additional <10% of predictive variance SATs give you is worth the high-school testing industrial complex.
A challenge with all of these analyses is they are measured after selection on the predictor variables themselves, which can induce biased estimates through range restriction. The raw correlations are even lower, and it is hard to know whether correcting is appropriate.
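Range restriction is easy to demonstrate with simulated numbers (a hypothetical true correlation of 0.5 and a top-20% admissions cut; not the College Board's data): selecting on the predictor itself shrinks the observed correlation among admits well below the population value.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# Hypothetical applicant pool where SAT and college GPA correlate r = 0.5.
sat = rng.normal(0, 1, n)
gpa = 0.5 * sat + np.sqrt(1 - 0.25) * rng.normal(0, 1, n)

r_full = np.corrcoef(sat, gpa)[0, 1]

# We only observe college GPA for admitted students -- and admission
# here selects on the predictor itself (top 20% of SAT).
admitted = sat > np.quantile(sat, 0.8)
r_restricted = np.corrcoef(sat[admitted], gpa[admitted])[0, 1]

# The correlation among admits is far below the population correlation,
# purely because the admitted group's SAT variance has been truncated.
print(f"full pool r: {r_full:.2f}, admitted-only r: {r_restricted:.2f}")
```

This is why the raw correlations in admissions datasets look so low, and why "corrected" estimates depend on strong assumptions about the unobserved applicant pool.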