I can claim that the last genealogical common ancestor of humanity lived in Africa or East Asia, and have a similar chance of being right. 2/n
On a personal level, I find the hypothesis strange. There's nothing special about being descended from this common ancestor. Millions of people were born before this (mostly meaningless) genealogical common ancestor and were just as wonderfully human.
Millions of people were born after a genealogical common ancestor but were not descended from them. These people built amazing buildings, civilizations, and art and lived incredible lives all over the world. They are just as much part of humanity. 4/n
It's a neat mathematical parlour trick, and it is mathematically true that genealogical CAs exist. However, saying that this reconciles science w. the idea of Adam and Eve sweeps a lot of stuff under a very patchy, ugly carpet. 5/n
Note that the most recent genealogical CA changes over the generations. The genealogical CA of people 2k years ago can be much further back in time than the genealogical CA of everyone alive today. 6/n
This older CA is also an older (not most recent) CA to all modern humans. But there is no uniqueness to this individual, just a chain of many such genealogical ancestors tracing all the way back till eukaryotes started having sex (and further back still).
In a few thousand years someone else will be the most recent genealogical CA of all living individuals. Perhaps it'll be you #LifeGoals, but probably not. ht @jashapiro xkcd.com/1545/
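To make the "genealogical CAs exist" point concrete, here is a minimal simulation sketch. It assumes a toy model that is not from the thread: a constant population of `n` individuals, non-overlapping generations, and every individual drawing two distinct parents uniformly at random from the previous generation. The function name and parameters are hypothetical. It walks back in time until some individual is a genealogical ancestor of everyone in the present-day generation.

```python
import random

def generations_to_genealogical_ca(n, rng, max_generations=10_000):
    """Generations back until one individual is an ancestor of all n
    present-day individuals, under random biparental mating (toy model)."""
    full = (1 << n) - 1
    # Bitmask per individual: which present-day people descend from them.
    masks = [1 << i for i in range(n)]
    for t in range(1, max_generations + 1):
        parent_masks = [0] * n
        for child_mask in masks:
            # Assumption: two distinct parents chosen uniformly at random.
            mother, father = rng.sample(range(n), 2)
            parent_masks[mother] |= child_mask
            parent_masks[father] |= child_mask
        if any(m == full for m in parent_masks):
            return t
        masks = parent_masks
    return None  # no common ancestor found within max_generations

if __name__ == "__main__":
    rng = random.Random(1)
    print(generations_to_genealogical_ca(200, rng))
```

In this idealized model the answer is typically only a handful of generations, which is why the existence of such an ancestor says so little on its own. Real populations have geographic structure, so the toy numbers should not be read as estimates for humans.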
My lab read this paper for journal club, and had some thoughts on the strong claims made about the number of signals of selection found. 1/ biorxiv.org/content/10.110…
The GLMM test that is applied in the paper, which gives the Z statistic and estimated selection coefficients, is looking for SNPs that show a consistent temporal allele frequency change. That is equivalent to gene-environment correlation methods using time as the environment. 2/
The authors’ claim about the number of selection targets rests on the cutoff for their Z statistic from the GLMM. The distribution of their test is greatly inflated compared to their null, which they interpret to be the result of many loci experiencing sweeps. 3/
The use of the phrase “ancestry-specific variant” is increasing, particularly to describe rare single-nucleotide variants (SNVs). But these alleles are not ancestry-specific. They have not yet been found elsewhere, but they will be. 1/n
The world has ~8 billion people, each carrying two copies of the genome. The mutation rate is ~10^-8 per base pair per generation, so every base pair has mutated ~160 times in the past generation alone. Thus, every single (non-lethal) base pair mutation will be present in every large human grouping. 2/n
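The ~160 figure can be checked with a quick back-of-the-envelope calculation. This is a sketch under two assumptions not spelled out in the tweet: each person carries two genome copies (diploidy), and, for the per-change figure, the three possible base changes at a site are equally likely. The function name is hypothetical.

```python
def expected_new_mutations_per_site(n_people, mu, copies_per_person=2):
    """Expected number of new mutations at one base pair in one generation."""
    return n_people * copies_per_person * mu

population_size = 8e9   # ~8 billion people (from the thread)
mutation_rate = 1e-8    # per base pair per generation (from the thread)

per_site = expected_new_mutations_per_site(population_size, mutation_rate)
# Simplifying assumption: the three possible base changes are equally likely.
per_specific_change = per_site / 3

print(round(per_site), round(per_specific_change))
```

Real mutation rates vary by site and by type of change, so the per-change figure is only a rough order of magnitude.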
An allele currently found only in European ancestry sequences will be present in India many times over, for example. Rapid population growth means that there’s a vast reservoir of rare alleles within all human “ancestry groups”. No rare allele will be specific to one of them. 3/n
The All of Us paper is rightly being criticized for its UMAP figure, which suggests an overly discrete view of human variation—a problem that is compounded by colouring the plot with self-identified race and then omitting the “self-identified” from the title & legend. 1/n
The paper presents a major NIH resource, but it does not take on board the carefully thought-through advice of the NAS panel that the NIH commissioned (presumably for exactly this kind of purpose). 2/n nationalacademies.org/our-work/use-o…
The admixture plots go some way to showing that people & self-id race do not map discretely onto clusters. But here again the choice to equate ancestry & self-identified race is misleading (e.g. choice of clustering level & colour matching in legend) 3/n
The NAS report on the use of Population Descriptors in Human Genetics & Genomics is definitely worth a read by everyone working in human genetics and adjacent fields (chapter 5 is a good place to dive in if you’re short on time). 1/n nap.nationalacademies.org/catalog/26902/…
The report has a range of well-argued, practically motivated recommendations that we can all start to explore implementing in our research and papers. 2/n
It emphasizes the ongoing need to stop conflating the social constructs of race & ethnicity with genetic ancestry, and to use sample descriptors that do not mix these concepts. 3/n
The genetic ancestry labels that the NIH “All of Us” biobank genetic data returns seem likely to add to general confusion about genetics and race/ethnicity [see appendix A here: researchallofus.org/wp-content/the…] 1/n
“All of Us” assigns study participants to one of a few discrete ancestry groups using their genome data (based on genetic similarity to the 1000 genomes & HGDP categories). 2/n
These labels are a poor description of fairly continuous genetic diversity. See the figure below, where the discrete labels are shown on a PCA plot of the “All of Us” data 3/n
@kathleenmkay @DavidBLowry So if you're wanting to show a single locus, dominance variance is the easiest to add in.
@kathleenmkay @DavidBLowry You can say additive variance is the variance explained by assuming a linear relationship between genotype and phenotype (red line), and dominance variance is the extra variance explained by the deviation away from additivity (dev. away from red line).
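A minimal single-locus sketch of that decomposition, assuming Hardy-Weinberg genotype frequencies and no environmental noise (both assumptions are mine, as are the function and variable names). The "red line" is the frequency-weighted least-squares regression of genotypic value on allele count.

```python
def variance_components(p, genotypic_values):
    """Split single-locus genetic variance into additive and dominance parts.

    p: frequency of the counted allele.
    genotypic_values: mean phenotype for 0, 1 and 2 copies of that allele.
    Assumes Hardy-Weinberg genotype frequencies.
    """
    q = 1.0 - p
    freqs = [q * q, 2 * p * q, p * p]
    counts = [0.0, 1.0, 2.0]
    mean_x = sum(f * x for f, x in zip(freqs, counts))
    mean_g = sum(f * g for f, g in zip(freqs, genotypic_values))
    cov = sum(f * (x - mean_x) * (g - mean_g)
              for f, x, g in zip(freqs, counts, genotypic_values))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, counts))
    slope = cov / var_x  # slope of the regression ("red") line
    fitted = [mean_g + slope * (x - mean_x) for x in counts]
    # Additive variance: variance explained by the regression line.
    v_a = sum(f * (y - mean_g) ** 2 for f, y in zip(freqs, fitted))
    # Dominance variance: variance of the deviations from the line.
    v_d = sum(f * (g - y) ** 2
              for f, g, y in zip(freqs, genotypic_values, fitted))
    return v_a, v_d

# Fully dominant allele at frequency 0.5: some variance is non-additive.
print(variance_components(0.5, [-1.0, 1.0, 1.0]))
# Purely additive locus: the dominance variance is zero.
print(variance_components(0.5, [0.0, 1.0, 2.0]))
```

Note that even with complete dominance most of the variance in this example is additive, because the regression line absorbs as much of the genotype-phenotype relationship as it can.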
@kathleenmkay @DavidBLowry I say that the additive variance is the variance due to the additive effect of sharing alleles. For example, parents contribute a single allele at each locus to their children and so you can ask how much they covary (resemble each other) due to sharing that allele.