Eiko Fried @EikoFried
1/ Why depression is not a useful or reasonable phenotype for research in clinical psychology, psychiatry, or medicine: a summary of evidence in 18 tweets.

Sx = symptoms, MDD = major depressive disorder.
2/ There are ~280 scales to assess MDD severity (tandfonline.com/doi/abs/10.120…). Scales are often only weakly inter-correlated, and differ widely in symptom content. The 7 most commonly used rating scales encompass > 50 disparate Sx (ncbi.nlm.nih.gov/pubmed/27792962).
3/ This means that MDD is a very ill-defined phenotype, with little agreement on how to measure it (imagine this amount of disagreement for e.g. cancer or HIV). The DSM-5 defines 9 Sx, but these were chosen based on subjective, not psychometric reasons (biomedcentral.com/1741-7015/13/72).
4/ DSM MDD criteria fail to assess important Sx such as anger & anxiety that predict future outcomes for patients (ncbi.nlm.nih.gov/pubmed/24026579; ncbi.nlm.nih.gov/pubmed/18172020). See fantastic series of papers by Mark Zimmerman (“Diagnosing major depressive disorder”) in J Nerv Ment Dis.
5/ Ken Kendler has written about the topic as well (ajp.psychiatryonline.org/doi/10.1176/ap…). And we have shown that DSM Sx do not outperform non-DSM Sx psychometrically (they are not more inter-correlated; test statistics in respective papers: linkinghub.elsevier.com/retrieve/pii/S…; linkinghub.elsevier.com/retrieve/pii/S…).
6/ MDD has low reliability. DSM-5 field trials showed that MDD is one of the least reliable diagnoses, with inter-rater reliability of 0.28 (borderline personality disorder: 0.54) (ncbi.nlm.nih.gov/pubmed/23111466). This isn’t “questionable” (screenshot), it’s “barely above chance level”.
7/ Most research adds Sx up to a sum score, which is then used as predictor, outcome, or moderator. But sum scores from different scales mean different things because they are based on different Sx → this poses a threat to replicability/generalizability of depression research (dx.doi.org/10.1016/j.jad.…).
8/ MDD is highly heterogeneous. 227 unique Sx profiles (SP) exist for MDD diagnosis using DSM-5 criteria. When disaggregating the 9 criteria (e.g. insomnia vs hypersomnia), 16400 SP emerge. Patients differ dramatically in SP.
(ncbi.nlm.nih.gov/pubmed/25451401; ncbi.nlm.nih.gov/pubmed/24886017).
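The 227 figure can be checked by brute force. This is only a sketch, and it assumes the DSM-5 symptom rule, which the thread does not spell out: at least 5 of the 9 criterion Sx, of which at least one must be depressed mood or anhedonia. Duration and impairment criteria are ignored, and the symptom labels are shorthand.

```python
from itertools import combinations

# Assumption (not stated in the thread): DSM-5 symptom rule for MDD is
# "at least 5 of 9 criterion symptoms, including at least one core symptom".
# Labels are shorthand, not official DSM wording.
SYMPTOMS = [
    "depressed mood", "anhedonia", "weight/appetite change",
    "sleep disturbance", "psychomotor change", "fatigue",
    "worthlessness/guilt", "concentration problems", "suicidal ideation",
]
CORE = {"depressed mood", "anhedonia"}
MIN_SYMPTOMS = 5


def qualifying_profiles(symptoms=SYMPTOMS, core=CORE, minimum=MIN_SYMPTOMS):
    """Enumerate every symptom combination that meets the assumed rule."""
    profiles = []
    for k in range(minimum, len(symptoms) + 1):
        for combo in combinations(symptoms, k):
            if core & set(combo):  # at least one core symptom present
                profiles.append(combo)
    return profiles


profiles = qualifying_profiles()
print(len(profiles))  # 227
```

Same result by hand: 256 ways to pick 5 or more of 9 Sx, minus the 29 that contain neither core Sx.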
9/ This means that a sum score of 18 points on e.g. BDI scale might mean something fundamentally different for patients Susie & Tom, depending on the particular Sx they have (insomnia and suicidal ideation are *not* interchangeable, neither clinically nor psychometrically).
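A toy illustration of the Susie & Tom point. Everything here is made up for the example: the item names are illustrative rather than the actual BDI items, and the scores are hypothetical, assuming items scored 0-3.

```python
# Hypothetical item scores (0-3 per item) for two made-up patients.
# Item names are illustrative and not the actual BDI item list.
susie = {"sadness": 3, "insomnia": 3, "fatigue": 3,
         "appetite loss": 3, "concentration problems": 3, "crying": 3}
tom = {"loss of interest": 3, "guilt": 3, "suicidal ideation": 3,
       "agitation": 3, "irritability": 3, "worthlessness": 3}


def sum_score(scores):
    """Total severity: the single number most analyses keep."""
    return sum(scores.values())


def endorsed(scores):
    """Set of symptoms the patient actually reports (score above 0)."""
    return {item for item, score in scores.items() if score > 0}


shared = endorsed(susie) & endorsed(tom)

print(sum_score(susie), sum_score(tom))  # 18 18
print(sorted(shared))                    # []
```

Identical sum scores, zero Sx in common: once only the total is kept, the two patients cannot be told apart.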
10/ MDD sum scores add up not only very diverse Sx, but literal opposites (e.g. insomnia vs hypersomnia; weight gain vs weight loss; appetite gain vs loss; psychomotor agitation vs retardation). From a measurement perspective, this is pretty stunning.
11/ MDD sum scores also add together items that are weakly correlated or even uncorrelated, since common rating scales are not unidimensional in patient samples. Again, from a measurement perspective, adding uncorrelated items is pretty darn crazy (psycnet.apa.org/record/2016-04…).
12/ Below a simple visualization of the correlation matrix of IDS-30 Sx in n~3500 STAR*D data, without (left) and with (right) thresholding at correlation = 0.15. Many weak correlations.
13/ MDD as a category is another issue: sum scores in community & clinical samples have (skewed) normal distributions, not bimodal. Research comparing ‘healthy’ vs ‘depressed’ groups by splitting at peak of distribution (mean/median) is problematic in terms of subgroup validity.
14/ Another problem of sum scores: specific MDD Sx differ in important properties (e.g. risk factors, impact on impairment, biomarkers, etc). I wrote my PhD on this. For a review, see: bmcmedicine.biomedcentral.com/articles/10.11…
15/ Here is one empirical example: some MDD Sx are highly impairing, others are not (unique R^2 of Sx on impairment). So Sx *nature*, not Sx *severity*, predicts impairment in patients (dx.plos.org/10.1371/journa…). Right side: impact of Sx differs for specific impairment domains.
16/ Conclusion: MDD is not a useful phenotype for scientific research. Great summary by Gordon Parker in ncbi.nlm.nih.gov/pubmed/15856717. Screenshot from frontiersin.org/articles/10.33…:
17/ GWAS studies have all of the above problems.

a) aggregate data from different sources that used different scales w different Sx
b) add Sx to sum
c) split on arbitrary criteria / dataset
d) correlate binary diagnosis to SNPs

Destroys nearly all information in the process.
18/18 How to move forward?

Analyze individual symptoms & use data-driven approaches to inform development of DSM-6 (frontiersin.org/articles/10.33…; tandfonline.com/doi/full/10.10…).

The end.