Yesterday @kph3k noted that I "loathe" decile analysis as a way of describing the results of a PGS analysis. The subsequent discussion clarified why I loathe it-- it's a misleading way of reporting results, a systematic sleight of hand to disguise the import of a small effect. /1
To illustrate, I created some simple data for 10,000 observations. PGS has a mean of 0 and an SD of 1; IQ has a mean of 100 and an SD of 15. They are correlated around .205, so the PGS accounts for about 5% of the variance in IQ. /2
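A minimal sketch of that simulation in Python (my reconstruction, not the thread's actual code; numpy, the seed, and the variable names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility
n = 10_000
r = 0.205  # target correlation between PGS and IQ

# PGS ~ N(0, 1); IQ is a weighted sum of PGS and independent noise,
# scaled to mean 100 and SD 15, so that corr(PGS, IQ) ≈ r.
pgs = rng.standard_normal(n)
noise = rng.standard_normal(n)
iq = 100 + 15 * (r * pgs + np.sqrt(1 - r**2) * noise)

print(np.corrcoef(pgs, iq)[0, 1] ** 2)  # R² ≈ .04-.05
```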
PGS analyses are very simple-- at bottom, it's just a bivariate regression. How to illustrate the result? The obvious way is to just draw the scatterplot with the regression line through it. It is what it is. /3
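Continuing with the simulated pgs and iq arrays above, the fit and scatterplot take a few lines (scipy and matplotlib are my choices here, not the thread's):

```python
import numpy as np
from scipy.stats import linregress
import matplotlib.pyplot as plt

fit = linregress(pgs, iq)  # simple bivariate regression of IQ on PGS
print(fit.slope, fit.intercept, fit.rvalue ** 2)

plt.scatter(pgs, iq, s=2, alpha=0.3)
grid = np.linspace(pgs.min(), pgs.max(), 100)
plt.plot(grid, fit.intercept + fit.slope * grid, color="red")
plt.xlabel("PGS (z-units)")
plt.ylabel("IQ")
plt.show()
```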
But how to describe the import of the PGS, the difference it makes in real people's IQs? This is where decile analysis comes in. Why not report the predicted scores for individuals at the 10th and 90th percentiles? The results seem impressive! /4
Look at that: 5% of the variance may not seem like much, but the predictions for the 10th and 90th percentiles differ by 9 IQ points. But wait: all you are doing here is reporting the location of the regression line, i.e., the MEAN predicted scores of people at the extremes. /5
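To see where those numbers come from: the "decile" predictions are just the regression line evaluated at the 10th and 90th percentile z-scores of the PGS (a sketch continuing from the fit above):

```python
from scipy.stats import norm

# z-scores at the 10th and 90th percentiles of a standard-normal PGS
z10, z90 = norm.ppf(0.10), norm.ppf(0.90)  # ≈ -1.28 and +1.28

pred10 = fit.intercept + fit.slope * z10
pred90 = fit.intercept + fit.slope * z90
print(pred90 - pred10)  # mean predicted gap, roughly 8-9 IQ points
```

The gap is just slope × (z90 − z10); nothing about it depends on the extremes of the data being special.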
This is giving yourself credit for your greatest source of certainty (the big sample) while ignoring the greatest source of uncertainty (the poor prediction of the PGS). It is answering the wrong question: it doesn't tell you how well the score would work in the real world. /6
What you really want to know is the *prediction interval*, the uncertainty surrounding the prediction for a single new participant. The predictions are the same, but the intervals are way wider, because they take into account how badly the PGS actually works. /7
This is a much more sobering result. For someone at the 10th percentile of the PGS, you can predict with 95% confidence that their IQ score will be between 67 and 124; for someone at the 90th percentile, you can predict it will be between 76 and 133. The big sample is no help. /8
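Computing the prediction intervals (as opposed to the confidence interval for the mean) is where the wide ranges come from. A sketch using statsmodels, again assuming the pgs/iq arrays and percentile z-scores from above:

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({"pgs": pgs, "iq": iq})
model = smf.ols("iq ~ pgs", data=df).fit()

new = pd.DataFrame({"pgs": [z10, z90]})
pred = model.get_prediction(new).summary_frame(alpha=0.05)

# mean_ci_* is the (narrow) confidence interval for the regression line;
# obs_ci_* is the 95% prediction interval for a single new person.
print(pred[["mean", "mean_ci_lower", "mean_ci_upper",
            "obs_ci_lower", "obs_ci_upper"]])
```

With a correlation around .2, the residual SD is still nearly the full 15 (15 × √(1 − r²) ≈ 14.7), so the prediction intervals barely narrow no matter how big the sample gets.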
Bottom line: reporting decile predictions without prediction error is a QRP, part of an unintentional but systematic program of sweeping the biggest problem of human behavioral genomics-- tiny effect sizes-- under the methodological rug. Smart researchers should cut it out. /end
/ps. Another thing about decile analyses is that they are generally presented as though they were showing you something special about the data. "The R2 may be small, but look-- there is something interesting at the extremes of our data."
/ps2 I have never seen one where that is actually the case. Mean effects at the extremes are just a general property of small correlations. Researchers should be clear that this is special pleading: arguing that their small correlation is important anyway.
Haven’t had a grouchy GWAS thread from me in some time, but this GWAS of occupational status punches my buttons. biorxiv.org/content/10.110… /1
I don’t have a lot of nice things to say, so I should be clear that none of this really pertains to these particular authors. The issues I have are with industry-standard practices. /2
My first problem is the use of “up to” in reporting results. The authors use the phrase 12 times. It has become common in GWAS: compute something a bunch of different ways, then report only the best result, saying it is “up to” the max value. /3
I am teaching the grad psychopathology seminar, and despite everything I still insist on spending the first week talking about Freud and psychoanalysis with a group of very smart and very skeptical young people. Here is what I tell them. No claim of originality. /1
If a significant part of what we call mental illness consists of "problems in living," what is the nature of those problems? Why, in a world in which we are so lucky to be alive and have so many possibilities for joy, do we spend our time worried at best, miserable at worst? /2
Freud invented a quasi-scientific method to find out. He spoke to people in private. He withheld all moral judgement, and encouraged them to be absolutely honest about what they thought, felt, wanted, feared, and lusted after. Here is what he found. /3
A thought on the @NatureHumBehav policy. ALL journals have a policy like this, it's just that most of them don't write it down. The idea that journals should publish everything that is scientifically valid is a fantasy. Real journal articles don't arrive labelled valid or invalid. /1
I don't mean that in some high-falutin' "science always exists in a cultural context" sort of way. I mean it from the point of view of a working Associate Editor. Peer review helps but doesn't solve the problem. You can find three reviewers to approve or trash anything. /2
This is especially true in social science, where everything tends to be soft. There is nothing preventing someone from collecting data that seems to them to suggest that people were better off under slavery or that the Holocaust was good for the economy. /3
Still thinking about Peter Visscher's essay and reply. The last point in his reply accuses GWAS skeptics of moving goalposts. This takes some chutzpah on the part of the GWAS community. A thread: /1
You may be able to accuse someone of moving goalposts, but not me. I was kicking field goals on this subject in 1998, when GWAS was a twinkle in human genetics' eye. /2
I wrote two long chapters referencing Visscher's work in 2011. Given the date, they were pretty accurate about where we were headed. /3
Peter Visscher left a comment on my blogpost from last week, here: turkheimer.com/peter-visscher…. First things first, I got the journal where he published his article wrong. I fixed it. /1
Visscher doesn't like that I used the word “eugenic” in connection with his essay. But, broadly, eugenics refers to the use of genetics to explain existing racial, social and economic disparities. Uncritical application of between-family GWAS does just that. /2
He doesn't grapple with my central point: once we know about unconfounded (really, less confounded) within-family estimates, those are the ones we need to interpret and base our applications on. Good intentions don't rescue between-family effects from eugenic implications. /3
So EA4 is finally out. It's a massive project, and I am not here to question its scientific validity, but rather to ask a tougher question: Has GWAS of complex human behavior turned out to be a disappointment? /1 nature.com/articles/s4158…
We are now over 3 million participants. A 3x increase in sample size produced a 25% increase in between-family R2. Remember when we were told that bigger samples would inevitably lead to scientifically or practically actionable results? Are we still waiting for that? /2
Maybe it is going to come in subsequent papers, but many of the old EA goals have been quietly abandoned. Not much interpretation of top hits, not much biological annotation. In its place just raw prediction, and it is still less than impressive. /3