I need to raise awareness about an important point in #scRNAseq data analysis, which, in my opinion, is not acknowledged enough:
‼️In practice, most cell type assignment methods will fail on totally novel cell types. Biological/expert curation is necessary!
Here's one example👇
Last year, together with @LabPolyak @harvardmed, we published a study in which we did something totally awesome: we experimentally showed how a TGFBR1 inhibitor drug 💊 prevents breast tumor initiation in two different rat models!
As you can imagine, this is a big thing. Treating tumors is already hard, preventing them is even harder!
Obviously, the most burning question for us then became: what is the drug actually doing to prevent tumor initiation?
Or, what is different in treated vs. control cells?
Long story short: we identified a group of cells popping up/expanding after treatment in both strains (ACI & SD)
How do we know these cells are unique to treatment?
Because all other subpopulations were matched, except this one. So these cells are important.
But what are they?
An obvious thing to do is differential expression between these cells and various other relevant groups (such as all other cells).
We did that, and got back an interesting list of many mesenchymal and stem-cell markers, most of which are also characteristic of stroma!
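For readers who haven't run this comparison before, here is a minimal sketch of the idea: rank genes by log2 fold change of mean expression between the population of interest and all other cells. Everything in it is an assumption for illustration: the gene names, the toy values, the pseudocount and the function name are hypothetical, and this is not the pipeline used in the study. A real analysis would add a statistical test (e.g. Wilcoxon rank-sum) and multiple-testing correction.

```python
import math


def log2_fold_changes(target_cells, other_cells, genes, pseudocount=1.0):
    """Rank genes by log2 fold change of mean expression (target vs. rest).

    target_cells / other_cells: lists of per-cell expression vectors
    (hypothetical normalized values), one value per entry in `genes`.
    """
    def mean_per_gene(cells):
        n = len(cells)
        return [sum(cell[g] for cell in cells) / n for g in range(len(genes))]

    target_mean = mean_per_gene(target_cells)
    other_mean = mean_per_gene(other_cells)

    # The pseudocount avoids division by zero for genes with no expression.
    lfc = {
        gene: math.log2((t + pseudocount) / (o + pseudocount))
        for gene, t, o in zip(genes, target_mean, other_mean)
    }
    # Most up-regulated genes in the target population come first.
    return sorted(lfc.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C"]
new_population = [[1.0, 7.0, 5.0], [3.0, 9.0, 5.0]]
all_other_cells = [[7.0, 1.0, 5.0], [7.0, 1.0, 5.0]]

ranked = log2_fold_changes(new_population, all_other_cells, genes)
for gene, value in ranked:
    print(f"{gene}: {value:+.2f}")
```

The ranked list only tells you which genes distinguish the population. Deciding what those genes mean biologically is still up to you.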
‼️All cell type assignment methods we tried failed to characterize this population accurately. Most of them labeled it "stroma".
But, we weren't easily fooled.
@nellage has *tens of years* of experience with this experimental model & I spent *years* researching relevant papers.
Our teams spent >2 years investigating these cells.
We spent hundreds of hours discussing them.
We embarked on costly & time-consuming experiments to dig deep.
(Science obsession at its best🤫)
All because we wanted to know for ourselves: is this really a new epithelial type??
So, how did we decide if this population is novel epithelium or stroma?
Two main strategies:
1. We knew the literature had evidence for extremely rare (<0.1%) progenitor mammary populations related to tumor initiation. We found lots of similarities with those populations.
‼️ 2. We actually did experiments to validate the epithelial (and not stroma!) nature of these cells. We found rare cells in the breast that express both epithelial markers and markers of this subpopulation.
These experiments were tough because of the rarity of these cells.
All in all, we gathered substantial evidence (both computational & ultimately experimental) that this population is indeed a novel epithelial type, and not just stroma.
Why, then, did *none* of the multiple tools we applied signal the novel nature of this subpopulation?
When you think about it, it's actually pretty straightforward.
There's absolutely no magic: cell assignment tools need references to match cells to, and assign based on max similarity with references.
Obviously, with a novel cell type, there's no reference to match it to.
Still, in such situations (unmatched to references), most methods claim to at least flag novel subpopulations.
But how about intermediate populations transitioning between cellular types, with context-dependent roles?
How about subpopulations very similar to other cell types?
The truth is that, in such cases, cell type assignment tools will fail, almost by definition. This expected behavior shouldn't surprise us.
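To make this concrete, here is a minimal sketch of max-similarity assignment with a rejection threshold. It is a toy under stated assumptions, not the implementation of any specific tool: the reference profiles, the cosine similarity, the 0.8 cutoff and the function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def assign(cell, references, min_similarity=0.8):
    """Label a cell with its most similar reference profile.

    Returns (label, similarity). The label is "unassigned" when even the
    best match falls below `min_similarity` (an assumed, arbitrary cutoff).
    """
    label, score = max(
        ((name, cosine(cell, profile)) for name, profile in references.items()),
        key=lambda pair: pair[1],
    )
    return (label if score >= min_similarity else "unassigned", score)


# Toy reference profiles over three hypothetical genes:
# [epithelial marker, mesenchymal marker, unrelated marker]
references = {
    "epithelium": [9.0, 1.0, 0.0],
    "stroma": [1.0, 9.0, 0.0],
}

stroma_like_novel = [3.0, 8.0, 0.0]  # novel type that resembles stroma
unrelated_novel = [0.0, 1.0, 9.0]    # novel type unlike any reference

print(assign(stroma_like_novel, references))
print(assign(unrelated_novel, references))
```

In this toy, the novel type that resembles nothing in the reference set gets flagged as "unassigned", while the stroma-like one clears the threshold and is confidently labeled "stroma". The rejection step only helps when the new population looks unlike every reference.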
This is why expert/biological knowledge is necessary & has authority over any algorithm.
Also, biological validation is the ultimate proof!
‼️ I want to make it explicit that I am not claiming cell type assignment algorithms perform poorly.
I think such algorithms are nothing short of extraordinary.
It's just that they can't do all the work for us.
We also need to understand the biology behind our data.
The reason I am bringing this up is that, in my experience, it comes up repeatedly during discussions over concrete #scRNAseq datasets.
I've seen many #Bioinformatics analysts somehow reluctant to "override"/question the assignments of an algorithm.
That is missing the point.
As scientists, every decision we make needs to be justified.
Once justified and backed up by evidence, it is valid.
Once the point is valid, it doesn't matter if algorithm X or method Y says otherwise.
We only want one thing: our claim to be TRUE, to the best of our knowledge.
‼️Finally, actionable:
Understanding how cell type assignment algorithms actually work also helps us understand their limitations.
That's why my advice to #Bioinformatics folks is to read the papers behind *all* the algorithms they are applying.
(Sorry, no exceptions allowed!)