The most remarkable "discovery" is this: the correlation between the (CAD) polygenic scores of siblings is 0.5. And that between second-degree relatives is 0.25.
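Why 0.5 is expected rather than discovered: here is a minimal simulation sketch in R, assuming unlinked SNPs, random mating, and a purely additive score. The number of families, number of SNPs, allele frequencies, and effect sizes are made up for illustration and are not taken from the paper.

```r
set.seed(1)
n_fam <- 2000   # hypothetical number of sib pairs
n_snp <- 500    # hypothetical number of unlinked SNPs in the score
freq  <- runif(n_snp, 0.1, 0.9)   # made-up allele frequencies
beta  <- rnorm(n_snp)             # made-up effect sizes

# One random haplotype per family (rows = families, columns = SNPs)
hap <- function() {
  matrix(rbinom(n_fam * n_snp, 1, rep(freq, each = n_fam)), n_fam, n_snp)
}
m1 <- hap(); m2 <- hap()   # the mother's two haplotypes
f1 <- hap(); f2 <- hap()   # the father's two haplotypes

# A child receives, independently at each SNP, one allele from each parent
child <- function() {
  pick_m <- matrix(rbinom(n_fam * n_snp, 1, 0.5), n_fam, n_snp)
  pick_f <- matrix(rbinom(n_fam * n_snp, 1, 0.5), n_fam, n_snp)
  ifelse(pick_m == 1, m1, m2) + ifelse(pick_f == 1, f1, f2)
}

prs1 <- as.numeric(child() %*% beta)   # score of the first sib
prs2 <- as.numeric(child() %*% beta)   # score of the second sib
sib_cor <- cor(prs1, prs2)
print(sib_cor)   # should land near 0.5, up to sampling noise
```

Each sib shares, on average, half of the parental alleles with the other, so the correlation follows from Mendelian segregation alone, whatever the trait.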
The next experiment is less trivial. They asked: if one sib has a PRS in the top 20%, what is the probability that the other sib also has a PRS in the top 20%?
3/7
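Their result was 44%. Here is some R code that gives the same answer:
# P(sib's PRS in top q | other sib's PRS in top q), for normal scores with sib correlation 0.5
risk_sib = function(q) {
  # t: standardized component of the score shared by the two sibs
  integrand = function(t) {
    dnorm(t) * pnorm(qnorm(1 - q) * sqrt(2) - t, lower.tail = FALSE)^2 / q
  }
  integrate(integrand, -Inf, Inf)$value
}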
4/7
To be fair: (1) It took a few lines to sketch the proof; see the Supplement of our preprint biorxiv.org/content/10.110… (2) The formula was somewhat off for the top 1% (predicting 13% instead of 21%), perhaps because the distribution deviates from normality in the tail.
5/7
Next they determined whether the PRS can identify which sib is affected in discordant sib-pairs. This is important, but there's already some literature on this, most notably nature.com/articles/s4159…
6/7
These results interest me, as they have implications for the utility of screening IVF embryos with PRS. The paper seems to have cited our previous work on the subject, which I'm honored by.
Curious what the rest of the paper looks like.
7/7
Thanks @adamkvonend for sending the paper. There was not much beyond the first page. But the figure with the full data is interesting. The setting where my little R snippet was off is an outlier, so I actually did pretty well.
8/10
My predictions for the probability that one sib has a PRS above a threshold (top 1%, 5%, 10%, or 20%), given that the other sib has crossed the same threshold, were 12.9%, 24.4%, 32.4%, and 43.6%. (Independent of the score.) Quite close to the actual data!
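For anyone who wants to reproduce these four predictions, here is a short sketch that re-defines the function from earlier in the thread and evaluates it at each threshold. It assumes normally distributed scores with a sib correlation of 0.5; the variable names `thresholds` and `predicted` are mine.

```r
# P(sib's PRS in top q | other sib's PRS in top q), normal scores, sib correlation 0.5
risk_sib <- function(q) {
  integrand <- function(t) {
    dnorm(t) * pnorm(qnorm(1 - q) * sqrt(2) - t, lower.tail = FALSE)^2 / q
  }
  integrate(integrand, -Inf, Inf)$value
}

thresholds <- c(0.01, 0.05, 0.10, 0.20)   # top 1%, 5%, 10%, 20%
predicted  <- sapply(thresholds, risk_sib)

# Compare the printed percentages with the figures quoted in the thread
print(data.frame(top_fraction = thresholds,
                 predicted_percent = round(100 * predicted, 1)))
```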
9/10
That said, the problem of the utility of cascade screening after PRS testing is really interesting. It can be further developed. E.g., I think what is important is the probability that a sibling of the proband will be affected given a high PRS in the proband.
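One way this could be sketched, under a liability threshold model that is my assumption and not something taken from the paper: liability is the PRS plus an independent residual, with total variance 1, and disease occurs when liability exceeds a threshold set by the prevalence. The function name `risk_sib_affected` and the parameters `r2` (proportion of liability variance explained by the PRS) and `K` (prevalence) are hypothetical, and the numbers in the example call are placeholders. The sketch also assumes that the proband's score informs the sib's liability only through the sib's own score, and it ignores the proband's disease status.

```r
# Hypothetical helper: P(sib affected | proband's PRS in the top q)
# q  : top fraction of the PRS distribution defining "high PRS"
# r2 : assumed proportion of liability variance explained by the PRS
# K  : assumed disease prevalence
risk_sib_affected <- function(q, r2, K) {
  z_K    <- qnorm(1 - K)        # liability threshold for disease
  sd_sib <- sqrt(1 - r2 / 4)    # SD of the sib's liability given the proband's score
  integrand <- function(t) {
    # t: proband's standardized PRS; the sib's expected liability is sqrt(r2) * t / 2
    dnorm(t) / q *
      pnorm((z_K - sqrt(r2) * t / 2) / sd_sib, lower.tail = FALSE)
  }
  # Average over proband scores above the top-q cutoff
  integrate(integrand, qnorm(1 - q), Inf)$value
}

# Placeholder inputs, for illustration only
risk_example <- risk_sib_affected(q = 0.2, r2 = 0.1, K = 0.05)
print(risk_example)
```

A quick sanity check on the design: with `r2 = 0` the score carries no information, and the function returns the prevalence `K`.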
10/10
Main results:
(1): A model for the segment length distribution under admixture that was ongoing for multiple generations.
(2): Extensive simulations for testing inference accuracy for admixture time + duration, ...
including when introducing errors in the recombination map. It seems that long admixture introduces high variability in the inference of the admixture duration, in particular when the recombination rate is mis-specified.
2/4
(3) They looked at admixture LD decay in Neanderthal ancestry in modern humans. The empirical data is consistent with admixture lasting anywhere between one and thousands of generations.
Here are impressions from an *in-person* class I just taught - my first since Jan 2020.
The setting: statistics for medical students. Total enrolled: 220; total in class: ~120; watching from home: ~60.
1/8
(1) Strangely, while the students were first year undergrads, they already knew each other very well (not sure where from). It didn't feel at all like I was teaching students on their very first day on campus.
2/8
(2) In fact, they knew each other "too well". They were talking non-stop and it was difficult to get them to be quiet. I was greatly missing the "mute" button! Similarly, it felt weird to *actually wait* for the students to get seated.
3/8
What can we learn from sequencing (100% genetically identical?) monozygotic twin pairs?
Turns out, a lot, particularly on early embryo development. Here, deCODE deeply sequenced ~400 twin pairs, along with their children/parents when available.
The authors found variability in the number of postzygotic mutations (those not present in the twins' parents): e.g., 39 twin pairs differed by >100 mutations, while 38 pairs did not differ at all.
The number of mutations increased with age, indicating that most of them accumulate through life.
2/10
More interesting are mutations that appear in a single twin + a child of that twin. These mutations must have occurred during early development, before the specification of the primordial germ cells, as they appear in both soma (blood/cheeks) and germline (children).
Large-scale GWASs yield increasingly accurate polygenic scores (PS), and it is now feasible to calculate such scores from genome-wide data of IVF embryos.
One company is already offering embryo screening for disease risk scores tinyurl.com/yygzdw7q 2/11
It is a short leap to imagine applying this technology outside disease risk. Prospective parents interested in “enhancing” the height or IQ of their future children might seek to generate and genotype IVF embryos, and use only the top-scoring one. 3/11