New paper from @joans and me! A pan-cancer, cross-platform analysis identifies >100,000 genomic biomarkers for cancer outcomes. Plus, a website to explore the data (survival.cshl.edu) and a (controversial?) discussion of “cause” vs. “correlation” in cancer genome analysis.
We used every type of data collected by TCGA (RNASeq, CNAs, methylation, mutation, protein expression, and miRNASeq) to generate survival models for each individual gene across 10,884 cancer patients. In total, we produced more than 3,000,000 Cox models for 33 cancer types.
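For readers who want to see what one of these per-gene survival models looks like, here is a minimal sketch of a single-covariate Cox model fit by Newton-Raphson. It is an illustration only, not the pipeline used in the paper: the function name `cox_univariate`, the Breslow handling of tied times, and the toy data are all assumptions.

```python
import math

def cox_univariate(times, events, x, iters=50):
    """Fit a one-covariate Cox model (Breslow ties) by Newton-Raphson.

    times  : follow-up time per patient
    events : 1 if the patient died, 0 if censored
    x      : the gene-level feature (e.g. expression) per patient
    Returns (beta, z), where z = beta / standard error.
    Hypothetical helper for illustration; not the paper's code.
    """
    # Sort by descending time so the risk set grows as we walk the list.
    order = sorted(range(len(times)), key=lambda i: -times[i])
    beta, info = 0.0, 0.0
    for _ in range(iters):
        s0 = s1 = s2 = 0.0   # running sums over the current risk set
        grad = info = 0.0    # score and observed information
        k = 0
        while k < len(order):
            t = times[order[k]]
            m = k
            # Add everyone with this follow-up time to the risk set.
            while m < len(order) and times[order[m]] == t:
                i = order[m]
                w = math.exp(beta * x[i])
                s0 += w
                s1 += w * x[i]
                s2 += w * x[i] * x[i]
                m += 1
            # Each death at time t contributes to the score and information.
            for i in order[k:m]:
                if events[i]:
                    mean = s1 / s0
                    grad += x[i] - mean
                    info += s2 / s0 - mean * mean
            k = m
        if info <= 0:
            raise ValueError("no information: need events and varying x")
        step = grad / info
        beta += step
        if abs(step) < 1e-9:
            break
    return beta, beta * math.sqrt(info)

# Toy data (made up): patients with high expression (x = 1) tend to die earlier.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 0, 1, 1]
expr = [1, 1, 0, 1, 0, 0, 1, 0]
beta, z = cox_univariate(times, events, expr)
print(f"beta = {beta:.3f}, Z = {z:.3f}")
```

A positive Z score means the feature tracks with shorter survival (an adverse biomarker), and a negative Z score means it tracks with longer survival (a favorable one). Repeating a fit like this for every gene, data type, and cancer type is how the model count climbs into the millions.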
Within each cancer type, we identified thousands of biomarkers for favorable and dismal patient outcomes. The most common adverse biomarkers included overexpression of the mitotic kinase PLK1, methylation of the transcription factor HOXD12, and mutations in TP53.
GO term analysis revealed common gene groups among adverse and favorable biomarkers, including cell cycle genes (upregulated in deadly cancers) and developmental transcription factors (methylated in deadly cancers).
We could use these biomarkers to stratify patient outcomes in clinically-ambiguous situations, including Stage 1a breast cancer and Gleason 7 prostate cancer. In general, gene expression and DNA methylation biomarkers provided the most prognostic information.
So now here’s where it gets weird: aside from mutations in TP53, we didn’t see many cancer driver genes score as strong biomarkers in our prognostic analysis. KRAS, EGFR, RB1, PIK3CA, NF1… mutation, methylation, or altered expression of these genes wasn’t really prognostic.
In the literature, if some gene is associated with worse cancer outcomes, then that is typically presented as evidence that that gene is an important cancer driver. But clearly KRAS and PIK3CA are important cancer drivers and they didn’t score in our analysis… so what gives?
To investigate this, we analyzed lists of cancer driver genes, and then we compared their prognostic significance to randomly-permuted gene sets. Surprisingly, verified oncogenes were no more likely to be prognostic than any randomly chosen gene in the genome!
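A comparison like this can be sketched as a simple permutation test. This is a rough illustration under stated assumptions, not the exact procedure from the paper: it assumes each gene has a single prognostic score (for example a Cox Z score), uses the mean absolute score as the test statistic, and the function name `permutation_pvalue` and the toy scores are hypothetical.

```python
import random

def permutation_pvalue(scores, gene_set, n_perm=2000, seed=0):
    """Empirical p-value for 'this gene set is more prognostic than random'.

    scores   : dict mapping gene name -> prognostic score (assumed: Cox Z)
    gene_set : genes of interest (e.g. a list of driver genes)
    Compares the set's mean |score| to random gene sets of the same size.
    """
    rng = random.Random(seed)
    genes = list(scores)
    k = len(gene_set)
    observed = sum(abs(scores[g]) for g in gene_set) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)  # random gene set, same size
        null = sum(abs(scores[g]) for g in sample) / k
        if null >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy scores (made up): 200 genes with scores from 0.00 to 1.99.
scores = {f"g{i}": i / 100 for i in range(200)}
strong_set = [f"g{i}" for i in range(195, 200)]   # the five highest scores
typical_set = [f"g{i}" for i in range(98, 103)]   # scores near the middle
print(permutation_pvalue(scores, strong_set))
print(permutation_pvalue(scores, typical_set))
```

In this framing, a gene set that behaves like `typical_set` (a large p-value) is "no more prognostic than random", which is the pattern described above for verified oncogenes.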
For instance - KRAS mutations clearly drive lung cancer. But KRAS mutations in lung cancer are *not* associated with worse patient outcomes. In some cases, mutations in specific oncogenes are associated with *better* outcomes, not worse outcomes.
If you infer the importance of a gene from survival analysis (which is exceptionally common in the literature, and is something I’ve previously done myself) - you could accidentally conclude that CENPA is a more important driver of prostate cancer progression than MYC:
In general, our analysis provides genome-wide evidence that inferring *causation* (gene A is a driver of cancer progression) from *correlation* (gene A is overexpressed in deadly cancers) is not appropriate for patient outcome analysis, even if it’s commonly done.
Next, we looked at cancer drug targets. Again, it is routine to see the fact that a gene is associated with deadly cancers presented as evidence that that gene is a good drug target. But is this link justified by the data?
We looked at the targets of all FDA-approved cancer drugs, and we found that these drug targets were no more likely to be prognostic than any randomly-selected gene in the genome!
Consider PD1 as a drug target. High levels of PD1 (PDCD1) are associated with longer patient survival. So you might think that PD1 inhibitors would kill people! But cancers don’t work like that - survival correlation is not causation - and PD1 inhibitors in fact prolong survival.
(You could imagine that this is a type of post-hoc fallacy - maybe these genes are non-prognostic because of the existing therapies. But we did a sub-analysis on drugs approved after 2017 [post-TCGA], and we observed the same pattern).
Then we asked - what happens if you target the worst adverse features in the genome? Maybe those are still the best drug targets? Among the top 50 prognostic factors in the genome, we found that 16 have been targeted in clinical trials, and 15 of them have failed.
We believe this is because the most prognostic factors are not selective oncogenes. They’re housekeeping cell cycle genes that are ubiquitously expressed, and they’re essential across cell types. No cell type-selectivity = systemic toxicity and trial failure.
Successful cancer drug targets may be adverse biomarkers, favorable biomarkers, or they may have no survival correlation whatsoever. Our data demonstrate that this type of prognostic analysis should be uncoupled from therapeutic target development.
To put this in perspective - imagine a KM plot of 10,000 senior citizens: “people receiving dialysis” vs “people not receiving dialysis”. Individuals receiving kidney dialysis are more likely to die than individuals who are not receiving dialysis...
Based strictly on this correlative observation, one could assume that kidney dialysis kills people! Yet, we know that people receiving dialysis are likely to be older and have several medical comorbidities, and dialysis saves their lives. Same thing in cancer genomics!
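The dialysis example can be made concrete with a small simulation. Every number here is invented for illustration (20% of people have kidney failure, 80% of those receive dialysis, and the death probabilities are arbitrary); the point is only that a treatment can look harmful in a crude comparison while being protective among the people who actually need it.

```python
import random

def simulate(n=10_000, seed=1):
    """Simulate (kidney_failure, dialysis, died) for n people.

    All probabilities are made-up assumptions for illustration.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        kidney_failure = rng.random() < 0.20
        # Only people with kidney failure are ever put on dialysis.
        dialysis = kidney_failure and rng.random() < 0.80
        if not kidney_failure:
            p_death = 0.10   # healthy kidneys
        elif dialysis:
            p_death = 0.40   # kidney failure, treated
        else:
            p_death = 0.90   # kidney failure, untreated
        people.append((kidney_failure, dialysis, rng.random() < p_death))
    return people

def death_rate(people, keep):
    """Fraction who died among the people selected by keep(person)."""
    group = [p for p in people if keep(p)]
    return sum(died for _, _, died in group) / len(group)

people = simulate()
# Crude comparison: dialysis vs. no dialysis, ignoring kidney function.
crude_dialysis = death_rate(people, lambda p: p[1])
crude_none = death_rate(people, lambda p: not p[1])
# Stratified comparison: only people with kidney failure.
sick_dialysis = death_rate(people, lambda p: p[0] and p[1])
sick_none = death_rate(people, lambda p: p[0] and not p[1])
print(f"crude:      dialysis {crude_dialysis:.2f} vs none {crude_none:.2f}")
print(f"stratified: dialysis {sick_dialysis:.2f} vs none {sick_none:.2f}")
```

In the crude comparison the dialysis group dies more often, because kidney failure drives both the treatment and the deaths. Once you compare only people with kidney failure, dialysis is clearly protective.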
Inferring functional relationships and prioritizing drug targets based on correlative outcomes analysis may be inappropriate, as these relationships can be fraught with confounding variables and spurious associations.
So, let me know what you think, and take a look at our website - survival.cshl.edu. 3 million Kaplan-Meier plots to explore and lots more exciting findings to uncover. Feedback welcome!
I should add - I was playing around with some of the ideas in the paper in the thread linked below. It goes a little deeper into the drug target analysis and the misinterpretation of what survival curves mean:
