Sanjay Srivastava @hardsci, 47 tweets
About to start - a symposium session on fresh* results from a multi-site ego depletion study #spsp2018

* Like apparently even the participating labs don’t know the results yet (?)
Vohs will give background and approach. @BJSchmeichel will present results. @DavidFunder will discuss (@EJWagenmakers couldn't make it in person)
Vohs: Purpose was to address challenges to the empirical basis of depletion. I wanted to be able to do something that I'd stand behind regardless of results
"Paradigmatic replication" - in contrast to previous studies this was not based on 1 particular study, instead experts came up with IVs and DVs based on theory and empirical track record
Paradigmatic replication not tied to one group or one author (who might feel targeted), instead shifts focus to construct. Used crowd-sourcing by experts
This project used 2 protocols, each lab ran one of them. Both were considered sound representations of constructs. Both used actual behavior
Everything was preregistered on aspredicted.org: Procedures, hypotheses, conditional statements, manip checks, moderators, exclusion criteria. Analysts were blinded to main DVs. Advisory board heavily involved in crafting the preregistration
"High touch approach": Labs gave feedback on IVs/DVs and chose which paradigm to run. Vohs created methods videos for the labs to model the procedure
[BTW I'm trying to just condense and paraphrase speakers unless I specifically say something's my commentary. A comment: lots of really thoughtful and impressive stuff in this approach]
Labs also created videos of them running mock participants for quality control. Videos/materials/etc will all be available online. All actual participants run one by one carefully
Neither Vohs nor Schmeichel ever touched the data. Advisory board was not depletion researchers
Layers upon layers of checks and balances. Vohs and Schmeichel are proponents - they wanted themselves insulated from things that could introduce bias
Now @BJSchmeichel is going to talk about specifics of methods and results
Schmeichel: Enrolled 41 labs, 36 stayed in to completion. 27 in USA, 9 in other countries. 34 labs analyzed so far. Results are preliminary (but with 34/36 labs reporting don't expect much to change)
One paradigm: Manipulate self-control with a habit-breaking task, measure performance on impossible puzzles ("e-task protocol")
Habit-breaking task: Cross out all the e's on page 1 - a long page of text. (Labs scored pages as a check on participant compliance.) Page 2: no-depletion cond Ss continue crossing out; depletion cond Ss have to follow new, complicated rules that break the habit
Then participants have to do a figure-tracing task without picking up a highlighter or retracing lines. The practice figure was solvable, then came more challenging figures; two were unsolvable. DV was # of attempts and time spent on task (avg of standardized scores)
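[For readers who want the composite made concrete, a minimal sketch of averaging standardized scores - my illustration, not the project's code; column names and scoring direction are assumptions]

```python
import numpy as np

def persistence_composite(n_attempts, seconds_on_task):
    """Average of standardized scores for the two persistence measures.

    Assumes higher values on both measures mean more persistence on the
    unsolvable figures; standardization is within the analyzed sample.
    """
    z_attempts = (n_attempts - n_attempts.mean()) / n_attempts.std(ddof=1)
    z_time = (seconds_on_task - seconds_on_task.mean()) / seconds_on_task.std(ddof=1)
    return (z_attempts + z_time) / 2

# Hypothetical usage:
attempts = np.array([3, 5, 8, 2, 6])
seconds = np.array([120.0, 250.0, 400.0, 90.0, 310.0])
print(persistence_composite(attempts, seconds))
```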
Manipulation checks and trait moderators were measured after the task
Protocol 2 ("writing task protocol") tested the theoretical hypothesis with different methods. IV: write a story about a recent trip you've taken (control cond), or the same but without using the letters "a" or "n" in the story (depletion cond)
DV was Cognitive Estimation Test. Ss make estimates of unknown quantities. (Frontal lobe patients are bad at the task.) e.g. "How many chips are in a 1 oz bag?"
Performance on CET is scored by assigning points based on similarity to normative responses. Closer-to-normative responses = higher score on CET. Prediction is depletion lowers scores on CET
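[A rough sketch of the normative-scoring idea - the actual CET rubric and norms weren't shown, so the banded rule and numbers here are my assumptions]

```python
def cet_item_score(response, norm_mean, norm_sd):
    """Score one estimation item by closeness to the normative response.

    Hypothetical banded rule: within 1 SD of the normative mean earns 2 points,
    within 2 SD earns 1 point, farther out earns 0. The real rubric may differ.
    """
    deviation = abs(response - norm_mean) / norm_sd
    if deviation <= 1:
        return 2
    elif deviation <= 2:
        return 1
    return 0

# Hypothetical usage: "How many chips are in a 1 oz bag?" with made-up norms
print(cet_item_score(response=18, norm_mean=15, norm_sd=5))  # -> 2
```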
3346 participants in current analysis. Preregistered analyses with and without exclusions. Primary predictions: effects on manip checks; poorer performance on DVs in depletion conditions; and tests of moderation
[GETTING CLOSE TO RESULTS PEOPLE]
Frequentist analyses were conducted by Sophie Lohmann and Dolores Albarracin
Frequentist results (multilevel models): Significant with exclusions but not without [I think, might have gotten that backward]. No interaction with protocol
Forest plot: 11 labs had directionally opposite effect, 23 found directionally consistent effect
Moderator analysis: Interaction between condition and fatigue is significant. Effect seems to be carried by the highest quartile of fatigue
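[To give a shape to the frequentist analysis, a minimal sketch of a multilevel model of that general kind in statsmodels - column names, coding, and the exact random-effects structure are my assumptions, not the preregistered model]

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumed layout: one row per participant, with columns
#   dv        - standardized outcome for that lab's protocol
#   condition - 0 = control, 1 = depletion
#   protocol  - "e_task" or "writing"
#   fatigue   - subjective fatigue (the moderator)
#   lab       - lab identifier (grouping factor)
df = pd.read_csv("depletion_data.csv")  # hypothetical file

# Condition effect, condition x protocol, condition x fatigue moderation,
# with a random intercept and a random condition slope for each lab.
model = smf.mixedlm(
    "dv ~ condition * protocol + condition * fatigue",
    data=df,
    groups=df["lab"],
    re_formula="~condition",
)
result = model.fit()
print(result.summary())
```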
Bayesian analysis was by Quentin Gronau and @EJWagenmakers. BF compares a point null versus a prior for "the effect exists." Informed prior was preregistered: d=.3, sd=.15, truncated at zero
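[To make that setup concrete, a minimal sketch of a two-sample t-test Bayes factor under an informed prior like the one preregistered, via numerical integration over the noncentral t likelihood - the t value and sample sizes in the example are hypothetical, and this is not the analysts' actual code]

```python
import numpy as np
from scipy import stats, integrate

def bf10_informed(t_obs, n1, n2, prior_mean=0.3, prior_sd=0.15):
    """BF10 for H1: delta ~ Normal(0.3, 0.15) truncated at zero, vs H0: delta = 0."""
    df = n1 + n2 - 2
    scale = np.sqrt(n1 * n2 / (n1 + n2))          # maps delta to the noncentrality parameter
    prior_mass = 1 - stats.norm.cdf(0, prior_mean, prior_sd)

    def integrand(delta):
        prior = stats.norm.pdf(delta, prior_mean, prior_sd) / prior_mass
        return stats.nct.pdf(t_obs, df, delta * scale) * prior

    marginal_h1, _ = integrate.quad(integrand, 0, np.inf)
    marginal_h0 = stats.t.pdf(t_obs, df)          # central t under the point null
    return marginal_h1 / marginal_h0

# Hypothetical example: t = 1.0 with 50 participants per condition in one lab
print(bf10_informed(1.0, 50, 50))
```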
Plot of BFs from each study: Most labs were between 3 and 1/3. Two labs greater than 3. 6 labs [I think - squinting at slide] less than 1/3
Overall: Data are about 4 times more likely under the null than under the "ego depletion exists" hypothesis. Moderate evidence *against* the hypothesis
Bayesian meta-analysis. Fixed and random effects, data determined a graded preference. Meta-analytic effect size is d = .08 [If I'm reading off the slide right]
Wagenmakers: "If the depletion hypothesis is nonetheless true, the effect is relatively small"
Conclusion: Small, significant effect not moderated by protocol, moderated by subjective fatigue. Evidence does not support the theory but is not strong evidence against it either
And now David Funder (@DavidFunder) is giving a discussion and reflections. He was not involved in the project
Funder: I'm a fan of Brunswik and representative design. If you haven't heard of it, shame on you [friendly tone]. Importance of varying stimuli and DVs, don't just pick one to represent a whole construct/hypothesis/theory etc
Use of multiple IVs and DVs was a strength. So was use of behavior rather than self-report
Bottom line of findings: There is a real (non-zero) but small effect. I am going to come back to what "small" means
It's an intuitive finding/hypothesis. Previous work suggests that even "physical" sense of fatigue is not a simple or direct function of actual bodily fatigue (lactic acid, muscle wear, stuff like that). "Physical" fatigue is in CNS not peripheral
As a field we are not good at evaluating effect sizes. Two usual ways, one bad and one worse. Bad way: automatically using Cohen's benchmarks. Worse way: square it to get %variance. Squaring an ES adds no information, just changes units (but rhetorically makes things sound small)
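[A tiny worked example of the "worse way" - plain arithmetic, just to show the point Funder is making]

```python
# Squaring a correlation only changes the units of the same effect,
# but it makes the number sound much smaller.
for r in (0.1, 0.2, 0.3, 0.5):
    print(f"r = {r:.1f}  ->  r^2 = {r**2:.2f}  ({r**2:.0%} of variance)")
# r = 0.1 -> 1% of variance; r = 0.3 -> 9%; the effect itself is unchanged.
```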
We need better benchmarks. One kind is other findings in the field. Average ES (Richard et al) is around r = .2. Those have to be overestimates because of publication bias
Another kind of benchmark is practical implications. Abelson (Psych Bull) looked at baseball statistics. Batting average predicts getting a hit at r = .05. Funder: "First of all, square that" (laughter)
Small effect sizes can accumulate (as in baseball example). The practically meaningful effect is the cumulative one. "Life gives us a lot of at bats"
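[A rough simulation of the accumulation point - illustrative numbers only; the batting averages and number of at-bats are my assumptions, not Abelson's data]

```python
import numpy as np

rng = np.random.default_rng(0)
n_seasons = 10_000
at_bats = 500                      # roughly a full season

# Two hypothetical hitters whose single-at-bat difference in hit probability is tiny
weak_hitter = rng.binomial(at_bats, 0.250, n_seasons)
strong_hitter = rng.binomial(at_bats, 0.300, n_seasons)

# Per at-bat the skill gap is 5 percentage points (a "small" effect),
# but over a season the better hitter reliably ends up with more hits.
print("mean season hits:", weak_hitter.mean(), "vs", strong_hitter.mean())
print("P(strong > weak over a season):", (strong_hitter > weak_hitter).mean())
```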
We feel bad about our small effect sizes because we see other people reporting big ones and think they are real. We need to rethink what ES is plausible, reasonable to expect, what is important (smaller than we thought)
"The replication controversy has started to make me at least ego-depleted." "I've heard it." "But my ego reserves are restored by projects like this that try to address issues with data"
Q&A: @BrianNosek "This is an exemplar of good science" [I agree!]
Session is concluded. That wraps it up. Thanks if you've stuck with my live-tweeting (and sorry if you didn't care :P)
POSTSCRIPT: I tried to mostly stick to condensing and paraphrasing the speakers in this thread. I posted my own thoughts/comments in another thread here: