In this paper we used REAL DATA from Military Physical Therapy clinics to show empirical issues with MCIDs on multiple levels.
Remember: if science is telling you there's "one cool trick" to say whether your data are meaningful, there's always a catch (usually, a lot of them).
3/17
The basic MCID process is: 1. Take the difference between pre and post scores 2. Determine your "anchor", which needs to be a binary outcome of interest (or an interval one you've dichotomized [UGH]) 3. Perform ROC analysis...
4/17
4. Extract the change score that balances sensitivity and specificity (usually using Youden's J)
POOF! You have your magic number you want to achieve for "clinical relevance".
But wait...
5/17
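To make that recipe concrete, here's a minimal sketch in Python with invented pre/post scores and a made-up binary anchor (illustration only, not the paper's code; scikit-learn's ROC utilities stand in for whatever software you actually use):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Invented data: pre/post scores plus a binary anchor
# ("Did you improve meaningfully?" yes = 1, no = 0).
rng = np.random.default_rng(0)
n = 200
pre = rng.normal(60, 8, n)
anchor = rng.integers(0, 2, n)
post = pre - anchor * rng.normal(6, 2, n) + rng.normal(0, 3, n)

# Step 1: the change (delta) score. Here a positive delta = improvement.
change = pre - post

# Steps 2-3: ROC analysis of how well the delta score separates
# anchor-positive from anchor-negative patients.
fpr, tpr, thresholds = roc_curve(anchor, change)
auc = roc_auc_score(anchor, change)

# Step 4: Youden's J picks the threshold balancing sensitivity and
# specificity; that threshold is what gets reported as "the MCID".
j = tpr - fpr
mcid = thresholds[np.argmax(j)]
print(f"AUC = {auc:.2f}, MCID (delta score) = {mcid:.1f}")
```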
You used a ROC, right? What was the Area Under the Curve? If your AUC is rubbish, then your MCID is rubbish.
Also, this is empirical data... right? Shouldn't there be confidence intervals around a point estimate? 🤔🤔
What about being biased by the baseline score?
6/17
Those 3 things are the entirety of this paper.
Basically, after you control for the baseline score, you need confidence intervals around BOTH your Area Under the Curve and the MCID to determine whether you've got a valid MCID...
7/17
We concluded that if your AUC confidence intervals did not cross 0.5 (the noise line), then your MCID could potentially be statistically valid (i.e., better than noise).
PROMIS Pain Interference looks like this:
The small window in the upper-middle is the potentially valid region.
8/17
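One simple way to put an interval around the AUC is a percentile bootstrap. A sketch with toy data (the bootstrap here is my illustration, not necessarily the interval method used in the paper):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy data standing in for real change scores and a binary anchor.
rng = np.random.default_rng(1)
anchor = rng.integers(0, 2, 200)
change = anchor * 4 + rng.normal(0, 3, 200)

def bootstrap_auc_ci(anchor, change, n_boot=2000, alpha=0.05, seed=1):
    """Percentile-bootstrap confidence interval for the AUC."""
    rng = np.random.default_rng(seed)
    n = len(anchor)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if len(np.unique(anchor[idx])) < 2:   # need both anchor classes
            continue
        aucs.append(roc_auc_score(anchor[idx], change[idx]))
    return np.quantile(aucs, [alpha / 2, 1 - alpha / 2])

lo, hi = bootstrap_auc_ci(anchor, change)
# If the interval crosses 0.5, the change score separates improvers from
# non-improvers no better than noise, so any MCID built on it is suspect.
print(f"AUC 95% CI: ({lo:.2f}, {hi:.2f})",
      "-> better than noise" if lo > 0.5 else "-> noise")
```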
After you look at the AUC, you look at your MCID and its 95% confidence intervals. The MCID is potentially valid if the estimate and its intervals don't cross zero and are theoretically consistent.
Here's that image (explained in the next tweet).
9/17
What do we mean by "theoretically consistent"?
Let's put it this way: would you really believe that a patient should have more pain interference in their life to reach a positive clinical outcome? Absolutely not; that's nonsense. That's all we're saying here.
10/17
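The same bootstrap idea can be applied to the MCID itself, with the "doesn't cross zero / theoretically consistent" check at the end (again toy data and an assumed improvement direction, purely illustrative):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy binary anchor and change scores.
rng = np.random.default_rng(2)
anchor = rng.integers(0, 2, 200)
change = anchor * 4 + rng.normal(0, 3, 200)

def youden_mcid(anchor, change):
    """Change-score threshold that maximizes Youden's J."""
    fpr, tpr, thr = roc_curve(anchor, change)
    return thr[np.argmax(tpr - fpr)]

def bootstrap_mcid_ci(anchor, change, n_boot=2000, alpha=0.05, seed=3):
    """Percentile-bootstrap confidence interval for the Youden's J MCID."""
    rng = np.random.default_rng(seed)
    n = len(anchor)
    est = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if len(np.unique(anchor[idx])) < 2:   # need both anchor classes
            continue
        est.append(youden_mcid(anchor[idx], change[idx]))
    return np.quantile(est, [alpha / 2, 1 - alpha / 2])

lo, hi = bootstrap_mcid_ci(anchor, change)
# "Doesn't cross zero" and "theoretically consistent": in this toy setup a
# POSITIVE change means improvement, so the whole interval should sit above zero.
print(f"MCID 95% CI: ({lo:.1f}, {hi:.1f})",
      "-> consistent" if lo > 0 else "-> not consistent")
```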
So the AUC has to be better than noise, the MCID has to be theoretically consistent, and those regions have to overlap. Where the MCID estimate is in BOTH of those windows, you've got a potentially valid MCID.
So you've got a conditional MCID...
11/17
You'd say something like: "When your baseline score is between 50 and 75, we can calculate a valid MCID based on this graph" (see tweet 9).
That's obviously a lot more complicated than a singular all-powerful MCID number most clinicians want.
12/17
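A conditional MCID just means the threshold is only derived (and only reported) inside the baseline window where the AUC and MCID intervals held up. A toy sketch of that restriction, with the 50-75 window hard-coded to mirror the example above:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy data: baseline scores, binary anchors, and change scores.
rng = np.random.default_rng(4)
baseline = rng.normal(62, 10, 300)
anchor = rng.integers(0, 2, 300)
change = anchor * 4 + rng.normal(0, 3, 300)

def youden_mcid(anchor, change):
    """Change-score threshold that maximizes Youden's J."""
    fpr, tpr, thr = roc_curve(anchor, change)
    return thr[np.argmax(tpr - fpr)]

# Only compute an MCID inside the baseline window where it was shown valid.
window = (baseline >= 50) & (baseline <= 75)
if window.sum() >= 30 and len(np.unique(anchor[window])) == 2:
    print(f"Conditional MCID (baseline 50-75): "
          f"{youden_mcid(anchor[window], change[window]):.1f}")
else:
    print("Too few patients (or only one anchor class) in this baseline window.")
```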
Additionally, MCIDs have issues in that they treat clinical outcomes as univariate. Are you and your future life only dependent upon Pain Interference? Or Anxiety? Or a joint's range of motion?
Certainly not. Humans are multidimensional and MCIDs don't capture that.
13/17
Moreover, many people get better or worse over time. Clinical progression is anything but linear.
We can call these "state changes", and MCIDs don't really capture that dynamic; they aren't designed to!
So what can we conclude? Well...
14/17
Current MCIDs are likely to be invalid on some level, whether from baseline bias, lack of statistical validity, or theoretical inconsistency.
We've got a method to do MCIDs "better", but it's more complicated and still univariate.
15/17
I think there are much better ways to go about solving these problems.
First, model the multidimensional nature of patients with lots of data (multivariable analyses).
Second, look into state-transition models or other higher-level models that reflect reality...
16/17
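For flavor, here's what the simplest possible state-transition (Markov) view looks like, with made-up clinical states and transition probabilities; real models would be estimated from data and would be far richer:

```python
import numpy as np

# Toy discrete state-transition (Markov) model over clinical states,
# rather than a single univariate change score. States and probabilities
# are invented for illustration only.
states = ["worse", "unchanged", "improved"]
P = np.array([            # rows: state at visit t, cols: state at t+1
    [0.50, 0.35, 0.15],   # worse ->
    [0.20, 0.50, 0.30],   # unchanged ->
    [0.10, 0.30, 0.60],   # improved ->
])

# Distribution of patients across states after three visits,
# starting everyone in "unchanged".
dist = np.array([0.0, 1.0, 0.0])
for _ in range(3):
    dist = dist @ P
print(dict(zip(states, dist.round(3))))
```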
So, yeah. We provocatively conclude that "the MCID may be too flawed a construct to accurately benchmark treatment outcomes".
The basic MCID may be simple, but it's wrong. There are better options; we just need to work towards them.
17/FIN
As they say, "if you come at me, you better not miss".
I've published a couple of papers in both @JAT_NATA and @JOSPT where we were critical of existing Minimal Clinically Important Difference (MCID) metrics
One suggestion was to use a baseline-adjusted ROC... Terluin disagreed.
Here are the JAT and JOSPT papers. One of which was a full simulation, the other was with empirical data:
Thanks to @erikMeira for the endorsement as an Athletic Trainer to follow on Twitter.
I'll take this opportunity to demonstrate to my Sports Med colleagues why you shouldn't trust the recently @US_FDA-authorized Q-Collar to protect against TBI.
I will demonstrate that the key study the @US_FDA cites for its authorization is a case study in scientific obfuscation, error inflation and (if we want to get really accusatory) p-value hacking.
1. Their main modality is DTI, or tractography. This examines water diffusion to map structural white matter pathways in the brain and how they might be disrupted.
This is one of those few papers I've written that I've felt was really important. I'll try to walk through the problems & remediations we identified with the widely used Minimal Clinically Important Difference (MCID) measures in Sports Medicine.
*THREAD*
1/20
This paper came about because, in my time working with @MOTIONetwork, I am often asked by some of our leading physicians (@AndrewSheeanMD, @jondickensmd) to derive MCID metrics for our Patient Reported Outcomes.
Basically, they want to know how much change is meaningful.
2/20
How much change is clinically relevant should ALWAYS be a consideration. The MCID is based on Receiver Operating Characteristic (ROC) curve analysis, where you try to identify the change (delta) score that reflects the optimal balance between sensitivity and specificity.
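A tiny toy illustration of that sensitivity/specificity balance (invented numbers, same idea as the sketch earlier on this page):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy delta (change) scores and a toy binary anchor.
anchor = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 0])
delta  = np.array([-1.0, 2.0, 0.0, 5.0, 1.0, 6.0, 4.0, 7.0, 3.0, 2.0])

fpr, tpr, thr = roc_curve(anchor, delta)
j = tpr + (1 - fpr) - 1          # Youden's J = sensitivity + specificity - 1
best = np.argmax(j)
print(f"MCID = {thr[best]:.1f} (sens {tpr[best]:.2f}, spec {1 - fpr[best]:.2f})")
```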
Exercise & Sport Sciences has a poor track record of developing "Novel Statistical Methods", and a new one was recently published: link.springer.com/article/10.100…
As a reviewer on this paper now working to have it retracted, here is my perspective on how these methods get popularized
[THREAD]
First, a table of contents: 1. Invited to review the manuscript & suggested rejection 2. Contacted by outside researchers because of concerns that a flawed paper was published in @SportsMedicineJ, where I am on the EB 3. Contacted the EiC to determine why the flawed paper was published
...
2/23
4. Developed simulations and formal mathematics providing evidence that the published method is deeply flawed 5. Contacted the authors, Dankel & Loenneke, about our work, asking whether our simulations/math are wrong and for their evidence
...
3/23
I apparently have a bit of a reputation as someone who is anti-machine learning or anti-AI when it comes to human research. This is a bit of a misrepresentation of my views, and (I'd argue) a misrepresentation of the issues "statistics people" take with AI/ML as a whole.
(THREAD)
I personally think that AI/ML has a lot to bring to the table to enhance science, health and human performance. The problem is that the AI/ML crowd are overselling their wares and are often being disingenuous about what the current state of the art really is.
2/10
Issue 1: CLAIMING EVERYTHING IS MACHINE LEARNING.
Just because AI/ML may use algebra or linear regression doesn't mean that algebra or linear regression IS AI/ML. The same goes for nonlinear regression, correlation, logistic regression, and everything else that IS STATISTICS (or information theory, etc.)
I've now had time to watch and read this interesting presentation by @LukeBornn, @OSPpatrick, and @DarcyNorman at @SloanSportsConf and have had time to collect my thoughts. Here's my thread and/or rant... depending on your perspective.
First, this is a great team of people with the expertise to tackle this problem. I actually think it's funny that the team as a whole mirrors my own background (statistician, athletic trainer & exercise scientist). Great team effort.
2/8
Second, with a few exceptions related to inelegant word choice about statistical significance and path diagrams, I agree that the authors have taken a reasonable approach to examining acute:chronic workload to predict injury/causal effects.
3/8