There are some straightforward 'stop using this term' aspects (the use of "Caucasian" for example); there are some complex "what does this term mean" questions (ethnicity labels, the ethnicity/race duality in the US vs just ethnicity in the UK / Europe) and then technical stuff on GWAS >>
The technical piece is about how we describe the commonplace GWAS protocol of subselecting a group of people in cohorts for association analysis; a reminder that the standard process has two steps to achieve pseudo-randomisation of non-genetic factors with respect to genetic factors
Step 1. Calculate genetic principal components and select a group of people with lower observed genetic variance in the cohort being studied (aka, the "blob" in PC1/PC2). Step 2. Add many genetic components into the linear models to "model out" any residual non-genetic effects.
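The two steps above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the pipeline used in the paper: the function names, the median/MAD rule for picking the "blob", the choice of 10 components and the simulated cohort are all hypothetical.

```python
import numpy as np

def genetic_pcs(genotypes, n_pcs):
    """Top principal components of a (samples x variants) genotype matrix."""
    g = genotypes - genotypes.mean(axis=0)      # centre each variant
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]               # scale, dropping monomorphic variants
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

def select_cluster(pcs, n_sd=3.0):
    """Step 1 (assumed rule): keep samples within n_sd robust SDs of the
    median on both PC1 and PC2, i.e. the 'blob'."""
    pc12 = pcs[:, :2]
    med = np.median(pc12, axis=0)
    robust_sd = 1.4826 * np.median(np.abs(pc12 - med), axis=0)
    return np.all(np.abs(pc12 - med) <= n_sd * robust_sd, axis=1)

def variant_effect(variant, phenotype, pcs=None):
    """Step 2: least-squares effect of one variant, with PCs as covariates."""
    cols = [np.ones(len(variant)), variant]
    if pcs is not None:
        cols.append(pcs)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    return beta[1]

# Simulated cohort (hypothetical): a large group and a small group with
# different allele frequencies. The phenotype differs between the groups
# for purely NON-genetic reasons; no variant has any causal effect.
rng = np.random.default_rng(0)
n_main, n_minor, n_variants = 400, 40, 1000
p_main = rng.uniform(0.1, 0.9, n_variants)
p_minor = np.clip(p_main + rng.normal(0, 0.2, n_variants), 0.05, 0.95)
G = np.vstack([rng.binomial(2, p_main, (n_main, n_variants)),
               rng.binomial(2, p_minor, (n_minor, n_variants))])
is_minor = np.arange(n_main + n_minor) >= n_main
y = 5.0 * is_minor + rng.normal(size=n_main + n_minor)

pcs = genetic_pcs(G, n_pcs=10)
keep = select_cluster(pcs)                      # Step 1

j = np.argmax(np.abs(p_minor - p_main))         # most frequency-differentiated variant
naive = variant_effect(G[:, j], y)              # whole cohort, no PCs
adjusted = variant_effect(G[keep, j], y[keep], pcs[keep])  # Steps 1 + 2
```

In this simulation the naive estimate picks up the non-genetic group difference as if it were a variant effect, while the estimate from the selected cluster with PCs as covariates should sit much closer to zero.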
This two-step procedure works robustly (works means... the statistical test fits the expected null distribution most of the time, aka, "qq plot is good" or "LD score-based genomic inflation is low" and often some interesting hits are found and nearly always replicate) ...but...
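As a rough sketch of what "fits the expected null distribution" means in practice, here is the classic genomic inflation factor (lambda GC): the median observed chi-square statistic divided by the median expected under the null. Note this is the simple median-based version, not the LD score-based estimate mentioned above, and the function name and simulated z-scores are assumptions for illustration.

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(z_scores):
    """Lambda GC: median observed chi-square over the null median."""
    chi2 = np.asarray(z_scores, dtype=float) ** 2
    return float(np.median(chi2) / CHI2_1DF_MEDIAN)

rng = np.random.default_rng(1)
z_null = rng.normal(size=100_000)                 # well-behaved test statistics
lam_null = genomic_inflation(z_null)              # should be close to 1
lam_inflated = genomic_inflation(1.2 * z_null)    # inflated statistics, lambda above 1
```

A value near 1 corresponds to a "good" qq plot; values well above 1 suggest residual confounding or stratification (or, in large studies, genuine polygenic signal, which is what the LD score approach tries to separate out).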
We have to describe in our papers both what we are doing here and also the resulting subset of the cohort. We tend to sort of just gloss over the "what we are doing" bit and then get into all sorts of trouble in coming up with concise ways to talk about the subset of the cohort.
We advocate using *more* technical language to be clear about this procedure, eg "The European-associated PCA cluster, which aims to minimise variation in non-genetic factors and genetic factors". The "European" label here is a cultural label, *not* a genetic label.
We are pretty convinced that the routine use of this two-step procedure and its basis in genetic principal components tends to make people think this sub-group is being defined to control genetics, when it is at least as important that it is controlling non-genetic factors
This is because most societies have complex social stratification and groupings, and much of this is related to recent migrations, either forced and horrible (eg, slavery) or voluntary (eg, economic migration). It's a big part of human society.
This "PCA trick" leverages the recent ancestry signals to select a group of people with a more consistent non-genetic environment, mainly due to social factors. This is cute and clever, but we have to be super careful not to imply these groups are genetic.
I find moving between farm animal/laboratory genetics and human genetics very informative here. In these settings we can bring stronger pseudorandomisation (farm animal) or explicit randomisation (laboratory) assumptions to our genetics studies.
As such, one can deploy more sophisticated statistical models (eg Linear Mixed Models) which are fragile at best in the human setting where recent ancestry correlates with many phenotypes for plenty of reasons which are *not* genetic.
For most non-geneticists, this all gets a bit bamboozling, which is... because it is a bit complex... but we often sort of slouch back into genetic language (eg "genetic background" in a human genetics paper is often not about *genetics* ... it is about this social stratification)
In the paper we have forced ourselves to be explicit in recommendations (I am inspired by a conversation here with @genemodeller that he is very "biddable" in this area) knowing that we will definitely get feedback on this - so please do comment!
• • •
More thoughts post the extension of Stage 3 in the UK on COVID restrictions to July 19th.
Not enough in the press in my view is made of the change in *biology* of the virus; the virus is both more transmissible *and* is less dampened by 1st doses (in particular Ox/AZ, but Pfizer also) >>
This means that the current UK population, given both the number of 1st doses and where the 1st dose window is, is a far harsher transmission environment for Alpha and other variants than for Delta; we've seen this play out in the numbers
Sitting in a softly air-conditioned room on the Genome Campus, Cambridgeshire, hot and sunny outside, musing about Corona this week. TL;DR - the UK looks like it is sensibly going to delay; the Delta variant is more transmissible and this implies at least one more wave worldwide.
Context: I am an expert in genetics and computational biology; I know experts in viral genomics, infectious epidemiology, clinical trials and immunology. I have COIs - I am a longstanding consultant to Oxford Nanopore and I was on the Ox/Az trial
Reminder: SARS-CoV-2 is a highly infectious virus which causes a severe disease, COVID, in a subset of people, often leading to death. No healthcare system in the world could cope with unfettered transmission of the virus, so a variety of control measures have been put in place
It's been a hot half term in the UK; at the end of this week here are my thoughts on Coronavirus. TL;DR the delta variant has changed the calculus in the UK but we don't know how much the vaccination calculus makes this less of a concern. Still more concern globally than UK.
Context: I am an expert geneticist + computational biologist; I know experts in infectious epidemiology, viral genomics, immunology and clinical trials. I have COIs. I am a long-established paid consultant to Oxford Nanopore and I am on the Oxford/AZ clinical trial
Reminder: SARS-CoV-2 is an infectious virus which causes a horrible, sometimes lethal, disease, COVID. Left unchecked, no healthcare system would be able to cope with the number of sick individuals.
A Covid view, back in lovely Northumberland. TL;DR - Europe continues to vaccinate; UK, further in vaccination, has some concerning outbreaks associated with imported strains from India; much of the world continues to worsen with lack of vaccine supply.
Context: I am an expert in human genetics and computational biology. I know experts in infectious epidemiology, viral genomics, clinical trials, immunology. I have COIs: I am a paid consultant to Oxford Nanopore and I was on the Ox/Az vaccine clinical trial.
Reminder: SARS-CoV-2 is an infectious virus which causes a horrible disease, COVID19, in a subset of people, often leading to death. If we let the virus transmit unimpeded many people would die or be hospitalised; no healthcare system could cope with the rate of hospitalisation.
A view of COVID from sunny and windblown Northumberland this time, not my normal London view. TL;DR - developed countries are making their way across the vaccine bridge to a better 2021 (~variants); the storm still rages in many other countries; the world has to work as one.
Context: I am an expert in human genetics and computational biology. I know experts in viral genomics, infectious epidemiology, clinical trials, public health+ immunology. I have COIs: I am a consultant to Oxford Nanopore, who makes sequencing machines+ I was on the Ox/AZ trial
Reminder: SARS-CoV-2 is an infectious human virus which causes a horrible disease, COVID, in a subset of people, many of whom die. If one let the virus propagate naturally not only would many people die but no healthcare system could process the huge number of sick people so quickly
Sunday morning in London; strong sunlight but sharp air and the Coronavirus situation is still on track in the UK; I have more concerns across Europe, but there are good solutions (namely vaccination). The global situation is far far more concerning.
Context: I am an expert in genetics and computational biology. I know experts in infectious epidemiology, viral genomics, clinical trials, testing. I have some COIs; I am a long-established consultant to Oxford Nanopore, which makes sequencing machines, and I was on the Ox/Az trial.
Reminder: SARS-CoV-2 is an infectious virus which causes a horrible disease (COVID) in a subset of people, often leading to death. If we let infection progress at the virus' natural rate many people would die, and no healthcare system can cope with this rate of disease.