A new working paper for holiday reading! @peder_isager and I provide an introduction to three-sided testing, a framework for testing an estimate's practical significance. We offer a tutorial, Shiny app, + commands/code in #Rstats, Jamovi, and #Stata (🔗 below!) 1/9

#EconTwitter
Equivalence testing lets us test whether estimates are stat. sig. bounded beneath practically negligible effect size Δ (e.g., pink estimate). But estimates can be both stat. sig. diff. from zero and stat. sig. bounded beneath Δ. 2/9
[Image: Pink estimate is statistically significantly bounded between the delta bounds. Blue estimate is statistically significantly bounded above the upper delta bound. The confidence interval of the orange estimate intersects one of the delta bounds, but does not cross zero.]
Estimates can also be stat. sig. bounded outside of Δ (e.g., blue estimate). What should we conclude about estimates like these blue/orange estimates? Standard equivalence testing frameworks don't give us clear answers. We introduce researchers to a framework that does. 3/9
The three-sided testing (TST) framework combines two-sided minimum effects tests for inferiority/superiority with the two one-sided tests (TOST) equivalence testing procedure. TST can provide stat. sig. evidence that estimates are practically significant, or practically = 0. 4/9
[Image: Panel A shows an inferiority test, where H0 states that the estimate is greater than the lower delta bound and HA states that the estimate is less than the lower delta bound. Panel B shows a TOST procedure, where the null hypothesis is that the estimate is either above the upper delta bound or below the lower delta bound, and the alternative hypothesis is that the estimate is between the delta bounds. Panel C shows a superiority test, where the null hypothesis states that the estimate is less than the upper delta bound, and the alternative hypothesis states that the estimate is above the upper d...]
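A minimal sketch of the three tests, assuming a normal approximation for the estimate and symmetric bounds ±Δ. This is not the eqtesting, tsti, or ShinyTST implementation; the function name `tst_p_values` and its arguments are hypothetical, chosen only for illustration. The inferiority and superiority p-values are one-sided and would be compared against α/2 (matching a two-sided test at level α), while the TOST p-value would be compared against α.

```python
from statistics import NormalDist


def tst_p_values(estimate, se, delta):
    """Illustrative normal-approximation p-values for the three tests.

    Hypothetical helper, not an API from any package named in the thread.
    `delta` is the smallest effect size of interest (bounds are -delta, +delta).
    """
    std_normal = NormalDist()
    z_lower = (estimate + delta) / se  # standardized distance from -delta
    z_upper = (estimate - delta) / se  # standardized distance from +delta

    # Inferiority: H0 is theta >= -delta, HA is theta < -delta.
    p_inferiority = std_normal.cdf(z_lower)
    # Superiority: H0 is theta <= +delta, HA is theta > +delta.
    p_superiority = 1 - std_normal.cdf(z_upper)
    # Equivalence (TOST): both one-sided tests must reject,
    # so the reported p-value is the larger of the two.
    p_equivalence = max(1 - std_normal.cdf(z_lower), std_normal.cdf(z_upper))

    return {
        "inferiority": p_inferiority,
        "equivalence": p_equivalence,
        "superiority": p_superiority,
    }


if __name__ == "__main__":
    # Made-up numbers purely for demonstration.
    for name, p in tst_p_values(estimate=0.1, se=0.1, delta=0.5).items():
        print(f"{name}: p = {p:.4f}")
```

Because the three alternative regions are disjoint, at most one of these tests can reject for a given estimate.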
This procedure was developed by Goeman, @aldosolari, & Stijnen (2010), who show that by partitioning estimates' parameter spaces into disjoint regions, TST controls error rates over all three of its tests w/ no penalty to power. 5/9

onlinelibrary.wiley.com/doi/10.1002/si…
Practical significance conclusions about an estimate can be easily inferred from double-banded confidence intervals that combine the estimate’s (1 - α) CI (e.g., its 95% CI) with its (1 - 2α) CI (e.g., its 90% CI). 6/9
[Image: In the inferiority region, a 95% confidence interval is used to assess whether the estimate is significantly bounded beneath the lower delta bound. In the equivalence region, a 90% confidence interval is used to assess whether the estimate is significantly bounded within the delta bounds, even if the 95% confidence interval crosses one of the delta bounds. In the superiority region, a 95% confidence interval is used to assess whether the estimate is significantly bounded above the upper delta bound. The bottom four estimates have confidence intervals crossing the delta bounds, implying that...]
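The double-banded reading can be sketched as follows, assuming normal-approximation intervals and symmetric bounds ±Δ. The function names and the labels "inferior", "equivalent", "superior", and "inconclusive" are hypothetical, not output from the ShinyTST app or the tst() command.

```python
from statistics import NormalDist


def double_banded_ci(estimate, se, alpha=0.05):
    """Return the (1 - alpha) and (1 - 2*alpha) CIs, e.g. 95% and 90%."""
    std_normal = NormalDist()
    z_outer = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the 95% band
    z_inner = std_normal.inv_cdf(1 - alpha)      # critical value for the 90% band
    outer = (estimate - z_outer * se, estimate + z_outer * se)
    inner = (estimate - z_inner * se, estimate + z_inner * se)
    return outer, inner


def classify(estimate, se, delta, alpha=0.05):
    """Illustrative decision rule read off the two bands (labels are assumptions)."""
    outer, inner = double_banded_ci(estimate, se, alpha)
    if outer[0] > delta:
        return "superior"      # outer band lies entirely above +delta
    if outer[1] < -delta:
        return "inferior"      # outer band lies entirely below -delta
    if -delta < inner[0] and inner[1] < delta:
        return "equivalent"    # inner band lies entirely within (-delta, +delta)
    return "inconclusive"      # no test rejects


if __name__ == "__main__":
    # Made-up numbers: the outer band crosses +delta but the inner band does not.
    print(classify(estimate=0.35, se=0.085, delta=0.5))
```

The example call illustrates the case described in the image text, where the narrower band sits inside the delta bounds even though the wider band crosses one of them.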
To make things easy, we offer the ShinyTST app, a point-and-click Shiny app that tells you which test/confidence interval is relevant, provides p-values, and visualizes test results given an estimate, standard error, and SESOI. 7/9

jack-fitzgerald.shinyapps.io/shinyTST/
We also offer the tst() command in the eqtesting R package, the tsti command in Stata, and Jamovi code. You can visit the paper to find download instructions for all, + guidelines for implementation. We hope you find it useful! (8/9)

osf.io/preprints/psya…
For those on #EconTwitter, in addition to our PsyArXiv paper, we’ve also deposited a version into the Tinbergen Institute Discussion Paper Series: (9/9)

papers.tinbergen.nl/24077.pdf