If you ever want to sound like an expert without paying attention, you only need two words in response to any question:

"It depends"

A thread on why we should retire that two-word answer 🧵
When people say "it depends," they often mean the effect of one variable depends on the level of at least one other variable

For example:
You: Does this program improve depression?
Me, Fancy Expert: Well, it depends, probably on how depressed people were before the program
Understandably, you'll want some evidence for my "it depends"

Luckily, my underpaid RA has already fired up an ANOVA or regression, and *I* found that how depressed folks were before the program moderated its effect
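(Real me butting in: for anyone who hasn't run one of these, here's a minimal sketch of what that moderation analysis usually looks like. The data are simulated and every variable name is made up.)

```python
# Toy sketch of the expert's moderation analysis. Simulated data,
# made-up variable names: treated = program vs. control, baseline =
# pre-program depression severity (standardized).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),
    "baseline": rng.normal(0, 1, n),
})
# Simulate an outcome where the program helps more at higher baseline severity
df["outcome"] = (
    0.5 * df["baseline"]
    - 0.3 * df["treated"]
    - 0.2 * df["treated"] * df["baseline"]
    + rng.normal(0, 1, n)
)

# The treated:baseline interaction term is the statistical "it depends"
fit = smf.ols("outcome ~ treated * baseline", data=df).fit()
print(fit.summary().tables[1])
```

A significant `treated:baseline` coefficient is all it takes to start saying "it depends" at parties. Keep that in mind for everything below.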

"It depends" wins again?

Nope, so many problems
1. Correlation still isn't automatically causation

Unless you've experimentally manipulated the moderator, you have a lot of work to do before you can claim that a moderator is causal

I recently did a thread on some bare minimums, and they're TOUGH
Ok, but now using my expert-ness I've smartly retreated to claiming my "it depends" isn't causal. It's predictive, descriptive, or whatever our audience will find most convincing. We're good now, right?

Not even close
2. "It depends" is too vague to be useful

You realize I, fancy expert, am on the ropes. You ask "Is your 'it depends' about slopes or correlations?"

I sputter until you show everyone the "Same Slope Does Not Imply Same Correlation" section of this paper
journals.sagepub.com/doi/full/10.11…
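(Real me again: here's a tiny simulation of that "same slope, different correlation" point. It's mine rather than the paper's, but the idea is the same: two groups share a slope of 0.5, yet the noisier group has a much weaker correlation.)

```python
# Same slope does NOT imply same correlation: purely simulated example.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 10_000)

y_low_noise = 0.5 * x + rng.normal(0, 0.5, x.size)
y_high_noise = 0.5 * x + rng.normal(0, 2.0, x.size)

for label, y in [("low noise", y_low_noise), ("high noise", y_high_noise)]:
    slope = np.polyfit(x, y, 1)[0]
    r = np.corrcoef(x, y)[0, 1]
    print(f"{label}: slope = {slope:.2f}, correlation = {r:.2f}")

# Both slopes land near 0.50, but r is ~0.71 vs ~0.24. A moderator can
# change one without the other, so "it depends" has to say which it means.
```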
The technical details in that (excellent and thread-motivating!) paper by @dingding_peng and @rubenarslan are super interesting, and I recommend you read it

But even if you don't, the bottom line is: Even if "it depends" is technically true, we have miles to go before it's useful
But I didn't get tenure for nothing, I've now formulated my "it depends" into a tight, formalized theory. There's equations and everything!

To defend my fancy expert self, I've now put in way more work than most folks do. And it's still not enough
3. Our samples likely aren't large enough to reliably detect moderation effects of realistic size

I've beat this drum many times before, so I'll be brief. We're often underpowered for main effects, and interactions demand even more power
statmodeling.stat.columbia.edu/2018/03/15/nee…
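(Real me, with a quick toy power simulation of my own. The interaction here is half the size of the main effect, which is arguably generous, and at a sample size that's fine for the main effect, power for the interaction craters.)

```python
# Toy power simulation: main effect 0.4, interaction 0.2, n = 400 per study.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n, sims = 400, 500
hits_main = hits_inter = 0

for _ in range(sims):
    df = pd.DataFrame({
        "t": rng.integers(0, 2, n),  # treatment
        "m": rng.integers(0, 2, n),  # binary moderator
    })
    df["y"] = 0.4 * df["t"] + 0.2 * df["t"] * df["m"] + rng.normal(0, 1, n)
    p = smf.ols("y ~ t * m", data=df).fit().pvalues
    hits_main += p["t"] < 0.05
    hits_inter += p["t:m"] < 0.05

print(f"power for main effect: {hits_main / sims:.2f}")   # expect ~0.8
print(f"power for interaction: {hits_inter / sims:.2f}")  # expect ~0.2
```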
This bears out in large-scale analyses, where interaction effects with personality variables were less replicable than main effects

(Shout out to @EmorieBeck for this super impressive mega-analysis!)

proquest.com/openview/82e1c…
But my expert-ness protects me again! I've gotten BIG GRANT EVERYONE WANTS and hired a data scientist to make sure I'm properly powered. Will you leave me alone now?

Lol no
4. We use the wrong tools

We're measuring everything on Likert (1-5, 1-7, etc.) scales. But since our stats training makes no sense, we've been treating those ordinal responses as continuous and running linear regressions the whole time

We need an ordinal regression instead, and when we use it, our conclusions about interactions can change

(Screenshots from this paper again!)

journals.sagepub.com/doi/full/10.11…
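(Real me: a minimal sketch of what "use an ordinal model instead" can look like in practice, using statsmodels' OrderedModel on simulated Likert data. It's the workflow that matters, not these made-up numbers.)

```python
# Fit the usual linear model AND an ordinal (cumulative logit) model
# to a simulated 1-5 Likert outcome, then compare what each one says.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(2)
n = 500
X = pd.DataFrame({
    "treated": rng.integers(0, 2, n),
    "baseline": rng.normal(0, 1, n),
})
X["treated_x_baseline"] = X["treated"] * X["baseline"]

# Latent continuous response, chopped into a 1-5 Likert scale
latent = -0.4 * X["treated"] + 0.5 * X["baseline"] + rng.logistic(0, 1, n)
likert = pd.cut(
    latent, bins=[-np.inf, -2, -0.5, 0.5, 2, np.inf], labels=[1, 2, 3, 4, 5]
).astype(int)

# What we usually do: pretend the 1-5 response is continuous
ols_fit = smf.ols("y ~ treated * baseline", data=X.assign(y=likert)).fit()

# What the measurement supports: an ordinal regression (no constant allowed)
ord_fit = OrderedModel(likert.to_numpy(), X, distr="logit").fit(
    method="bfgs", disp=False
)

print(ols_fit.params)
print(ord_fit.params)
```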
Ok, ok, but now it's like 6 years later and I've run an appropriate model in an enormous sample. The interaction is still there! Sure, the moderation might or might not be causal, but vindication for the expert!?

I mean, maybe
5. We care about practical effects, not statistical ones

The original "it depends" carries implied gravity. We might even select who gets the program based on how depressed they are!

Except using single variables to do that is a fool's errand
ajp.psychiatryonline.org/doi/full/10.11…
There's also at least some evidence that single moderator variables have smaller effect sizes than main effects

psyarxiv.com/c65wm/
If we want to make any practical progress in identifying "what works for whom?" or any moderator-centered question, I (real me) think we have to:
- Embrace humility
- Run appropriate models that can account for many higher-order interactions at once (rough sketch after the link below)

doi.org/10.1080/153744…
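Here's one hedged sketch of what that second bullet could mean in practice: tree ensembles pick up higher-order interactions without us pre-specifying them. Every feature name is invented, and in real life the validation has to be a lot more honest than this.

```python
# Sketch: a gradient boosted model hunting a three-way interaction that
# no single-moderator regression would be specified to find. Simulated data.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 2_000
X = pd.DataFrame(
    rng.normal(size=(n, 6)),
    columns=["baseline", "age", "sleep", "social", "ses", "noise"],
)
# Outcome driven by a three-way interaction (plus noise)
y = X["baseline"] * X["sleep"] * (X["age"] > 0) + rng.normal(0, 1, n)

scores = cross_val_score(GradientBoostingRegressor(), X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f} (+/- {scores.std():.2f})")
```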
Ok, my fancy expert alter ego is nearing the end of an illustrious career now. I've folded my initial one-variable moderation idea into a machine learning model with lots of variables!

Science loved it, so we must finally be done?

I wish
6. Even once we have multivariate prediction models, we have to directly test whether using those models actually improves outcomes we care about

For example, do people assigned to programs via our model actually improve more than folks randomly assigned to those programs?
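(Real me: the test in point 6 is itself just another randomized comparison, between assignment strategies rather than treatments. A sketch, with every number invented:)

```python
# Compare outcomes under model-guided vs. random program assignment.
# Fake trial results: symptom change scores for the two randomized arms.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
model_arm = rng.normal(-1.2, 2.0, 250)   # assigned to programs by our model
random_arm = rng.normal(-1.0, 2.0, 250)  # assigned to programs at random

t_stat, p_value = stats.ttest_ind(model_arm, random_arm)
diff = model_arm.mean() - random_arm.mean()
print(f"difference in mean symptom change: {diff:.2f}, p = {p_value:.2f}")
# If this comparison doesn't favor the model arm, the prediction model
# hasn't earned a say in who gets which program.
```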
"It depends" is almost always too vague, poorly tested, and lacking in practical relevance to be a useful statement, even if it's technically true

What should we say instead?
I don't have a great answer! Some variations I've tried:

"I'm not sure, what kinds of variables might be relevant?"

"Hmm, how could we design a study/run a model that helps us get closer to knowing who benefits more?"

"I don't know"
Also, let's be real, "it depends" could be ok if it weren't so often said with a tut or a smirk

Knowing the world is complicated is no substitute for using the methods necessary to grapple with that complexity
So the next time we're tempted to just say "it depends" to look like an expert, let's all work together to go beyond that surface level response

I bet y'all will come up with some great alternative responses, and I look forward to learning what they are!


