NonsparseOncologist
May 7 · 13 tweets · 3 min read
🧵 A new Nature paper on “spatial ecotypes” claims liquid biopsy can predict immunotherapy response. The tumor biology is cool. The statistics are a masterclass in how to do bad biomarker research. @f2harrell has a checklist for this. Let’s go. 1/
First, Harrell’s Biomarker Uncertainty Principle: “A molecular signature derived from high-dimensional data can be either parsimonious or predictive, but not both.” This paper tries to be both. It earns neither. 2/ fharrell.com/post/badb
DICHOTOMANIA. They split 78 patients at the MEDIAN to draw KM curves. Harrell’s BBR shows median dichotomization reduces efficiency to 63.7% vs. continuous analysis: you need 158 patients to match what 100 give you. They burned 36% of their tiny cohort. 3/
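The arithmetic behind that 63.7% can be checked in a few lines. This is a minimal sketch that assumes the figure is the textbook asymptotic relative efficiency 2/π for a median split of a normally distributed predictor; the function names are hypothetical, and nothing here comes from the paper's data.

```python
import math

# Assumption: efficiency of a median split relative to a continuous
# analysis of a normally distributed predictor is 2/pi (about 0.637).
efficiency = 2 / math.pi


def patients_needed(n_continuous: int, eff: float = efficiency) -> int:
    """Patients a median-split analysis needs to match n_continuous."""
    return math.ceil(n_continuous / eff)


def effective_n(n_actual: int, eff: float = efficiency) -> float:
    """Continuous-analysis sample size that a median-split cohort is worth."""
    return n_actual * eff


print(round(efficiency, 3))        # 0.637
print(patients_needed(100))        # 158
print(round(effective_n(78), 1))   # 49.7
```

Under that assumption, a median-split cohort of 78 carries roughly the information of 50 patients analyzed on the continuous scale.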
“Virtually all published cutpoints are analysis artifacts.” The median isn’t biology. Add patients to one tail and the threshold moves. The tumor didn’t change. Your KM curves did. 4/ hbiostat.org/bbr/info
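A toy illustration of the moving threshold, using made-up biomarker values (not data from the paper): enrolling a few extra patients in the upper tail shifts the median, and a patient whose measurement never changed flips from "high" to "low".

```python
import statistics

# Hypothetical biomarker values for 9 patients (illustrative numbers only).
cohort = [0.8, 1.1, 1.4, 1.9, 2.3, 2.7, 3.0, 3.6, 4.2]
cut_before = statistics.median(cohort)

# Enroll four more patients, all in the upper tail.
extra = [5.0, 5.5, 6.1, 6.8]
cut_after = statistics.median(cohort + extra)

# One patient whose measured value is the same in both analyses.
patient = 2.7
label_before = "high" if patient > cut_before else "low"
label_after = "high" if patient > cut_after else "low"

print(cut_before, label_before)  # 2.3 high
print(cut_after, label_after)    # 3.0 low
```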
Natura non facit saltus. Nature doesn’t make jumps. The median cutpoint assumes patients just above and just below are maximally different, while patients just above and far above are identical. This is not how SE levels work. Not how tumors work. Not how anything works. 5/
The “winning biomarker” problem. @f2harrell’s bootstrap example: the apparent winner of 213 markers had a 95% rank CI of [30–213]. The data could only rule out its being in the bottom 29. SE8 “wins” across 41 features here.

Nobody ran the bootstrap. Nobody showed those rank CIs. 6/
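For readers who want to see what a rank bootstrap looks like, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the 78 x 41 data are random numbers in which every feature has the same modest true association with the outcome, absolute Pearson correlation stands in for whatever importance measure a real analysis would use, and the function names are hypothetical. Rank 1 means strongest here, the reverse of the convention in the 213-marker example.

```python
import random


def corr(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def strengths(rows, y):
    """Absolute correlation of each feature column with the outcome."""
    return [abs(corr([r[j] for r in rows], y)) for j in range(len(rows[0]))]


def rank_of(j, scores):
    """Rank of feature j among all scores; 1 means strongest."""
    return 1 + sum(s > scores[j] for s in scores)


def bootstrap_rank_ci(rows, y, n_boot=400, seed=0):
    """Percentile interval for the rank of the apparent winner."""
    rng = random.Random(seed)
    n = len(rows)
    full = strengths(rows, y)
    winner = max(range(len(full)), key=full.__getitem__)
    ranks = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        s = strengths([rows[i] for i in idx], [y[i] for i in idx])
        ranks.append(rank_of(winner, s))
    ranks.sort()
    return winner, ranks[int(0.025 * n_boot)], ranks[int(0.975 * n_boot) - 1]


# Simulated cohort: 78 patients, 41 features, identical true effect sizes.
rng = random.Random(1)
n, p = 78, 41
y = [rng.gauss(0, 1) for _ in range(n)]
rows = [[0.3 * y[i] + rng.gauss(0, 1) for _ in range(p)] for i in range(n)]

winner, lo, hi = bootstrap_rank_ci(rows, y)
print(f"apparent winner: feature {winner}, 95% rank interval [{lo}, {hi}]")
```

Because all 41 simulated features are equally predictive by construction, whichever one "wins" does so by chance, and the width of the printed interval shows how little the data can say about its true rank.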
Harrell checklist item: “Pick a winning biomarker even though tiny changes in the sample result in a different winner.” SE8 beats SE7 overall. SE7 beats SE8 in melanoma alone. That’s not biology. That’s rank instability. 7/ fharrell.com/post/badb
Harrell checklist: “Validate using a sample too small or that should have been in training data.” Paired plasma-tumor validation: n=23. No CIs on the Spearman correlations shown. With n=23 those CIs span from “meaningless” to “strong.” That’s not a result. That’s a range. 8/
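How wide is "a range" at n=23? A rough sketch using the Fisher z approximation, which is only approximate for Spearman's rho; the rho values below are hypothetical placeholders, not the correlations reported in the paper, and `fisher_ci` is a name made up for this example.

```python
import math


def fisher_ci(rho: float, n: int, z_crit: float = 1.959964):
    """Approximate 95% CI for a correlation via the Fisher z transform.

    Uses standard error 1/sqrt(n - 3), the usual Pearson formula; for
    Spearman's rho this is an approximation, not an exact interval.
    """
    z = math.atanh(rho)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical correlations, evaluated at the validation sample size.
for rho in (0.3, 0.5, 0.7):
    lo, hi = fisher_ci(rho, 23)
    print(f"rho={rho:.1f}, n=23: 95% CI ({lo:.2f}, {hi:.2f})")
```

For a hypothetical rho of 0.5, the interval runs from roughly 0.11 to 0.76, which covers everything from a negligible association to a strong one.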
BBR chapter 18: dichotomizing a continuous outcome can require 5x the sample size for equivalent power. This paper dichotomizes the predictor (high/low SE) AND the outcome (responder/non-responder). Double dichotomania. 9/
Harrell: “Touting a new biomarker while ignoring basic clinical info that may be more predictive.” TCGA survival models here adjust for age and sex only. No stage. No performance status. SE7 beats TMB! Did it beat a properly specified clinical model? We will never know. 10/
Liquid EcoTyper was trained on SIMULATED cfDNA: clean mathematical mixtures of tumor DNA + healthy plasma. Harrell: “Live within the confines of the information content of the data.” A simulation’s info content is not real plasma’s. Real plasma has biology that hates your assumptions. 11/
The abstract: “implications for improved risk stratification and therapy personalization.” The data: n=78, median split, simulated training, no calibration curves, no PH check, no rank CIs, adjusts for age/sex only. One of these is a Nature paper. The other is the truth. 12/
Harrell’s bad biomarker checklist: don’t have a stat analysis plan; categorize continuous variables; overstate predictive utility; avoid checking absolute accuracy. This paper hits them all. 13/13 /end

