Happy to share our new preprint with Edwin de Beurs, in which we recommend solving the current "So-Many-Scales-For-The-Same-Construct" dilemma (e.g. for depression) by mandating a common metric, not a common measure.🧵
We introduce the problem of scale proliferation and show how it impacts not only science but also communication (between researchers & policy makers; among clinicians; between clinicians & clients; etc).
In our new piece, Edwin & I instead suggest common metrics similar to IQ scores: they are independent of the specific instrument and simplify communication.
We introduce several metrics, and focus on T-Scores & Percentile Ranks, discussing their calculation, merits & shortcomings.
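For illustration, here is a minimal Python sketch of how the T-score and percentile-rank conversions work (the normative mean & SD below are made-up placeholders, not real norms for any depression questionnaire):

```python
# Minimal sketch of common-metric conversions. The normative values
# (norm_mean, norm_sd) are hypothetical placeholders for illustration.
from scipy.stats import norm

def to_t_score(raw, norm_mean, norm_sd):
    """Linear T-score: mean 50, SD 10 in the normative population."""
    z = (raw - norm_mean) / norm_sd
    return 50 + 10 * z

def to_percentile_rank(raw, norm_mean, norm_sd):
    """Percentile rank, assuming roughly normal scores in the norm group."""
    z = (raw - norm_mean) / norm_sd
    return 100 * norm.cdf(z)

# Example: a raw score of 30 on a scale with (hypothetical) norms
# mean=20, SD=8 maps onto the same common metric as any other scale.
print(to_t_score(30, norm_mean=20, norm_sd=8))         # 62.5
print(to_percentile_rank(30, norm_mean=20, norm_sd=8)) # ~89.4
```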
We provide empirical examples using 3 depression questionnaires to demonstrate the current confusion, and the benefits and utility of common metrics.
We conclude that T-scores & percentile ranks may aid measure harmonization & the interpretation of test results, enhance communication about test results among professionals, and ease explaining their meaning to clients.
Curious to hear your thoughts! End 🧵
PS: Massive shoutouts to @pravpatalay whose previous work on this was a big inspiration for Edwin & me to think about common metrics, & to @mirandarwolpert for starting a public debate on this issue in clinical psych!
PS2: I expect some pushback on this proposal from a content validity perspective. Would be really interested in your thoughts on this. I.e. can you harmonize X, Y, and Z if they aim to measure the same thing but don't actually do so?
Lots of responses already, thanks so much! My view: Standard scores allow us to see how *exceptional* a test result is, independent of the content it measures. E.g. you can compare how exceptional someone is on 3 IQ subtests, even if these subtests are uncorrelated.
This allows you to compare the results of scales assessing different constructs, or scales aiming (but failing) to assess the same one. What we gain from comparing standard scores of a personality scale vs a PTSD scale isn't clear to me, but of course it can be done.
But facing the reality of current measure chaos, there seems to be a lot to gain from comparing T-scores of mental health measures, especially since they are usually moderately correlated with each other and with impairment, and since psychopathology is transdiagnostic.
Some methodological ways to address this were proposed in the thread. I'll look into these in some detail and see if I can provide a response. Maybe the paper would also lend itself to a few commentaries, Gary already proposed one: "jingle-jangle on steroids" ;)
Beavers, like all species, are a convenient fiction. Thinking of beavers as a true category in nature is pre-Darwinian.
Beavers, instead, are a number of animals that cluster together quite closely in an n-dimensional space on a large number of features.
Outrageous? Bear with me.🧵
What are these features? They include things like length, hairiness, intelligence, number of limbs, distance of ears to claws, and so on.
These features cluster together because they are causally related (often in complex ways).
Imagine this 2-feature plot, except with 3.7 billion features.
You can see that many elephants and many beavers cluster on 2 features. You can also see an outlier elephant (lots of hair) and an outlier beaver (exceptionally heavy).
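A toy simulation of the idea (all feature values are invented for illustration; the real argument generalizes to billions of features):

```python
# Illustrative sketch: two "species" as clusters of individuals in a
# 2-feature space. All numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Each species is a cloud of individuals around species-typical values:
# columns = [weight in kg, hairiness 0-1].
beavers   = rng.normal(loc=[20, 0.9],   scale=[3, 0.05],   size=(100, 2))
elephants = rng.normal(loc=[4000, 0.1], scale=[500, 0.03], size=(100, 2))

def mean_dist_to(points, centroid):
    """Average Euclidean distance from each individual to a centroid."""
    return np.linalg.norm(points - centroid, axis=1).mean()

# Individuals sit far closer to their own cluster's centroid than to
# the other cluster's -- though outliers exist in both clouds.
print(mean_dist_to(beavers, beavers.mean(axis=0)))    # small
print(mean_dist_to(beavers, elephants.mean(axis=0)))  # large
```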
What do y'all think about APS' decision to offer 15-min flash talks (1 person) rather than symposia at the 2021 virtual conference? Bit sad that (online) panel discussions, symposia followed by discussions, etc. are skipped. Always found such interactions between folks the most engaging.
But have no experience in conference orga, so I'm sure there are good reasons.
To me, it looks like YouTube would do a better job, because 1) YouTube comes without a conference fee, and 2) presentations would be #openaccess rather than behind the APS paywall.
What am I missing? Thx!
In case all talks are open, there is of course genuine value in participating in APS ($ then pays for talks to be organized, vetted, grouped, etc).
In case talks are paywalled, I'm curious how that can be policed (you can hardly forbid folks from uploading their talks to a general OSF repository).
New paper "Lack of theory building and testing impedes progress in the factor and network literature" is in press at Psych Inquiry. This took a while to write—the first draft dates back to 2016.
The most exciting aspect is that Psych Inquiry will publish many critical commentaries, including from the very people whose work inspired me to write the paper in the first place. Some of them, incl. @IrisVanRooij & @psmaldino, are even listed in the ack section of the paper.
Here is Iris' preprint, together with @giosuebaggio:
Our paper on measuring outcomes that matter to depressed patients, caregivers, & healthcare professionals is out, led by the brilliant 🔥@ChevanceAstrid🔥.
Details in 🧵below. If you RT only one of my tweets this year, make it this one.
1/ Clinical studies on depression assess symptoms (e.g. sad mood), but there are no proper standards on what symptoms to measure. A recent meta-analysis on psychotherapeutic interventions (200 studies) identified 33 different outcomes used.
2/ Further, symptoms are not a good proxy for how people are doing; e.g. recovery of functioning often lags half a year behind symptom recovery. Finally, it's 2020, yet it is unclear what outcomes we should measure to fully capture people's lived experiences with depression.
However, standardization only helps us compare samples if measures are "measurement invariant", i.e. if they measure the same thing across samples. I am not fully convinced the chosen scales do that.
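A toy sketch of why this matters (numbers assumed, not from any real dataset): if an item's intercept differs between samples, observed means diverge even when the latent construct does not.

```python
# Toy illustration of a scalar-invariance violation. All parameters
# are assumed for illustration, not taken from any real study.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
theta_a = rng.normal(0, 1, n)  # latent depression, sample A
theta_b = rng.normal(0, 1, n)  # identical latent distribution, sample B

# Same loading (0.7), but sample B's item has a shifted intercept
# (e.g. the item is understood differently in that population).
item_a = 0.7 * theta_a + rng.normal(0, 0.5, n)
item_b = 0.7 * theta_b + 0.4 + rng.normal(0, 0.5, n)

# Observed means differ (~0.0 vs ~0.4) despite equal latent means:
# a spurious "group difference" created by non-invariance alone.
print(item_a.mean(), item_b.mean())
```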
Which leads me to the biggest ⛔️ of this decision:
Many good reasons to choose these particular scales, such as brevity, but one of the most important reasons from a measurement perspective is validity evidence: do they measure what they claim to measure? And I'm not convinced this is the case. Table below by @JkayFlake 2017.
For those who cannot join my talk on lack of theory in psychology today (2pm CEST, stream pmltalks.nl), here a summary in 6 comics, adapted from the brilliant original by @MrLovenstein.
2/6 Adapted to fit the talk, which is about the mismatch between data and theory.
3/6
With most psychological theories, the most benign situation we can find ourselves in is one where we acknowledge that our data are not actually informative about our theory.