Correspondence Analysis (CA) is an alternative to PCA that is robust for use with raw or log-normalized scRNAseq counts
& is consistent with studies that recommend decomposition of the Pearson residuals (Townes et al., 2019; Hafemeister & Satija, 2019; Lause et al., 2021).
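For readers who want to see what "decomposition of the Pearson residuals" means in practice, here is a minimal NumPy sketch of classic CA on a made-up genes x cells matrix. The function names and the toy counts are illustrative assumptions, not the corral implementation, and the sketch assumes no gene or cell has an all-zero count.

```python
import numpy as np

def pearson_residuals(counts):
    """Pearson residuals of a genes x cells count matrix.

    Assumes every row and column has a non-zero total.
    """
    counts = np.asarray(counts, dtype=float)
    P = counts / counts.sum()            # observed proportions
    r = P.sum(axis=1, keepdims=True)     # gene (row) masses, shape (genes, 1)
    c = P.sum(axis=0, keepdims=True)     # cell (column) masses, shape (1, cells)
    E = r @ c                            # expected proportions under independence
    return (P - E) / np.sqrt(E)

def classic_ca(counts, n_components=2):
    """Classic CA sketch: SVD of the Pearson residual matrix."""
    S = pearson_residuals(counts)
    U, d, Vt = np.linalg.svd(S, full_matrices=False)
    return U[:, :n_components], d[:n_components], Vt[:n_components].T

# Hypothetical toy data: 4 genes (rows) x 3 cells (columns)
toy_counts = np.array([[10, 0, 3],
                       [ 2, 8, 1],
                       [ 0, 5, 9],
                       [ 4, 4, 4]])

U, d, V = classic_ca(toy_counts)
print("singular values:", d)
```

The only difference from a PCA-style workflow in this sketch is the matrix handed to the SVD: residuals from the independence model rather than centered, scaled expression values.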
CA has a long tradition in diverse settings and disciplines, including linguistics, business and marketing research, and archaeology
There are many variations of CA that are better adapted to handle overdispersion than classic CA (decomposition of the Pearson residuals).
We tested these variations: variance-stabilizing transformations applied in conjunction with standard CA, and CA using different chi-squared statistics.
We report that CA of the Freeman-Tukey chi-squared residuals is better adapted to the overdispersion of scRNAseq counts.
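As a rough illustration of the Freeman-Tukey idea, the sketch below uses the textbook Freeman-Tukey deviate on the count scale, sqrt(x) + sqrt(x + 1) - sqrt(4E + 1), with E the expected count under independence. This is an assumption for illustration: the exact scaling used inside corral may differ, and the function name and toy counts are made up.

```python
import numpy as np

def freeman_tukey_residuals(counts):
    """Textbook Freeman-Tukey deviates for a genes x cells count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Expected counts under independence of genes and cells
    expected = np.outer(counts.sum(axis=1), counts.sum(axis=0)) / counts.sum()
    return np.sqrt(counts) + np.sqrt(counts + 1.0) - np.sqrt(4.0 * expected + 1.0)

# Hypothetical toy data: 4 genes (rows) x 3 cells (columns)
toy_counts = np.array([[10, 0, 3],
                       [ 2, 8, 1],
                       [ 0, 5, 9],
                       [ 4, 4, 4]])

R = freeman_tukey_residuals(toy_counts)

# Same decomposition step as classic CA, applied to the new residuals
U, d, Vt = np.linalg.svd(R, full_matrices=False)
gene_embedding = U[:, :2] * d[:2]
cell_embedding = Vt[:2].T * d[:2]
print(R.round(2))
```

The square roots act as a variance-stabilizing step, which is the intuition for why this residual can cope better with overdispersed counts than the Pearson form.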
The CA biplot provides easy cluster interpretation.
Transformed counts have an intuitive interpretation:
the chi-squared statistic, i.e. the strength of association between gene & cell.
Genes & cells in the same direction from the origin are associated.
Distance from the origin = magnitude of association.
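The biplot reading above can be checked numerically. This sketch (toy counts and variable names are illustrative assumptions) uses classic CA with genes in principal coordinates and cells in standard coordinates. With all components kept, the inner product of a gene vector and a cell vector equals observed/expected - 1, so vectors pointing the same way mean positive association, and the mass-weighted squared distances of genes from the origin add up to chi-squared / N.

```python
import numpy as np

# Hypothetical toy data: 4 genes (rows) x 3 cells (columns)
toy_counts = np.array([[10, 0, 3],
                       [ 2, 8, 1],
                       [ 0, 5, 9],
                       [ 4, 4, 4]])

N = toy_counts.sum()
P = toy_counts / N                  # observed proportions
r = P.sum(axis=1)                   # gene masses
c = P.sum(axis=0)                   # cell masses
E = np.outer(r, c)                  # expected proportions under independence
S = (P - E) / np.sqrt(E)            # Pearson residuals

U, d, Vt = np.linalg.svd(S, full_matrices=False)
gene_coords = (U * d) / np.sqrt(r)[:, None]   # genes, principal coordinates
cell_coords = Vt.T / np.sqrt(c)[:, None]      # cells, standard coordinates

# Inner products recover observed/expected - 1 for each gene-cell pair
association = gene_coords @ cell_coords.T

# Chi-squared statistic and its link to distances from the origin
chi_sq = N * (S ** 2).sum()
inertia_from_distances = (r * (gene_coords ** 2).sum(axis=1)).sum()

print(association.round(2))
print(chi_sq / N, inertia_from_distances)
```

A positive entry in `association` means the gene is seen in that cell more often than independence predicts; a value of -1 means it was not observed there at all.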
CA is better adapted to scRNAseq -> library depth batch effects are better addressed
The scMix data (CellBench @Bioconductor pkg) has 3 cell lines, with cells assayed on different platforms
PCA - batches are separated by their different library depths
CA - multiBatchNorm correction not needed
Plugging CA into existing pipelines is easy: it's a straightforward replacement for PCA, and it may improve them. We tested this with scRNAseq dataset alignment: replacing PCA with CA in the Harmony pipeline improves dataset alignment without impacting speed.
Finally, corral is simple, deterministic and fast.
Deterministic, direct methods deliver an exact solution, with the same results each time.
Iterative methods (such as glmPCA) start from a random seed & vary between runs, so we run them several times and take an average score.
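A small sketch of that contrast, under loud assumptions: a few steps of randomly initialized power iteration stand in for "an iterative method with a seed" (this is not glmPCA), the toy counts are made up, and the score is simply how well each run aligns with the direct SVD answer.

```python
import numpy as np

def power_iteration(S, seed, n_iter=2):
    """Stand-in iterative method: a few power-iteration steps from a random start."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=S.shape[1])
    for _ in range(n_iter):
        v = S.T @ (S @ v)
        v /= np.linalg.norm(v)
    return v

# Hypothetical toy data: 4 genes (rows) x 3 cells (columns)
toy_counts = np.array([[10, 0, 3],
                       [ 2, 8, 1],
                       [ 0, 5, 9],
                       [ 4, 4, 4]], dtype=float)
P = toy_counts / toy_counts.sum()
E = np.outer(P.sum(axis=1), P.sum(axis=0))
S = (P - E) / np.sqrt(E)            # Pearson residuals

# Direct method: two runs on the same input give the same answer
_, _, Vt_a = np.linalg.svd(S, full_matrices=False)
_, _, Vt_b = np.linalg.svd(S, full_matrices=False)
same_each_time = np.array_equal(Vt_a, Vt_b)

# Iterative stand-in: the result depends on the seed, so score several runs and average
alignment = [abs(power_iteration(S, seed) @ Vt_a[0]) for seed in range(5)]
mean_alignment = float(np.mean(alignment))

print(same_each_time, alignment, mean_alignment)
```

The iteration count is kept deliberately small so the seed dependence is visible; a well-converged iterative method would vary far less, but still needs the seed recorded to be reproducible.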
Lauren and I would love your feedback... This is her work.
Mini tweetutorials on Eugenics, Statistics, Medicine
Eugenics, "well born", was coined by Francis Galton, a cousin of Darwin. In 1873 he wrote "Hereditary Improvement". It claims the wealthy are a superior "breed" with higher intelligence. Genetics does not support his claims.
His comments on post-famine Irish were popular in Britain at the time, and many non-English groups were portrayed by negative stereotypes. He repeated a popular racist caricature. Modern genetic maps show that Ireland and Britain are genetically close.
Galton's flawed thesis had vast impact. Galton (1822-1911) founded and was the first president of the British Eugenics Society. Members included author H.G. Wells (1866-1946), politician Winston Churchill (1874-1965) and birth-control advocate Marie Stopes (1880-1958). eugenicsarchive.ca