Jack Fitzgerald | @jackfitzgerald.bsky.social
Economics PhD candidate @VUamsterdam and @ResearchTI. Working on applied econometrics, replication, + economics of science. @UvA_Amsterdam, @FloridaState alum.
Jan 22 7 tweets 2 min read
New working paper! @I4Replication does a good job detailing the experiment’s results, so for anyone who considers LLMs/AI as a (soon-to-be) solution to reproducibility in social science, let me walk you through the 🔥hell🔥 of trying to reproduce a paper on an "AI-led" team. 1/7
In AI Replication Games, we were randomized into human, cyborg (AI-assisted), + machine (AI-led) teams, all trying to reproduce a published paper. Humans couldn't use ChatGPT, whereas cyborgs could use it as much/as little as desired. My machine team could *only* use ChatGPT. 2/7
Dec 20, 2024 10 tweets 4 min read
A new working paper for holiday reading! @peder_isager and I provide an introduction to three-sided testing, a framework for testing an estimate's practical significance. We offer a tutorial, Shiny app, + commands/code in #Rstats, Jamovi, and #Stata (🔗 below!) 1/9

#EconTwitter
Equivalence testing lets us test whether estimates are stat. sig. bounded beneath a practically negligible effect size Δ (e.g., pink estimate). But estimates can be both stat. sig. diff. from zero and stat. sig. bounded beneath Δ. 2/9
[Image: Pink estimate is statistically significantly bounded between the delta bounds. Blue estimate is statistically significantly bounded above the upper delta bound. The confidence interval of the orange estimate intersects one of the delta bounds, but does not cross zero.]
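To make the logic above concrete, here is a minimal sketch of a three-sided test under a normal approximation. It is not the authors' code, Shiny app, or R/Jamovi/Stata commands; the function name `three_sided_test`, the symmetric bounds ±Δ, and the example numbers are all assumptions chosen for illustration. It runs one-sided tests against each bound and reports which region (below −Δ, within ±Δ, above Δ) the estimate is statistically significantly bounded in, if any.

```python
from statistics import NormalDist


def three_sided_test(estimate, std_error, delta, alpha=0.05):
    """Illustrative three-sided test with symmetric bounds (-delta, +delta).

    Hypothetical helper, normal approximation only. Returns the three
    p-values and the region (if any) supported at level alpha.
    """
    if std_error <= 0 or delta <= 0:
        raise ValueError("std_error and delta must be positive")

    cdf = NormalDist().cdf
    z_lower = (estimate + delta) / std_error  # distance from the lower bound -delta
    z_upper = (estimate - delta) / std_error  # distance from the upper bound +delta

    p_values = {
        # H1: the effect lies below -delta
        "below": cdf(z_lower),
        # H1: the effect lies inside (-delta, +delta); two one-sided tests,
        # so the p-value is the larger of the two one-sided p-values
        "within": max(1 - cdf(z_lower), cdf(z_upper)),
        # H1: the effect lies above +delta
        "above": 1 - cdf(z_upper),
    }

    # For alpha < 0.5 at most one of the three tests can reject, because the
    # three alternative regions do not overlap.
    conclusion = next(
        (region for region, p in p_values.items() if p < alpha),
        "inconclusive",
    )
    return p_values, conclusion


# Hypothetical estimates loosely mirroring the three cases in the figure.
delta = 0.5
for label, est in [("pink", 0.1), ("blue", 1.0), ("orange", 0.4)]:
    p_values, conclusion = three_sided_test(est, 0.1, delta)
    print(label, conclusion, {k: round(p, 4) for k, p in p_values.items()})
```

With these made-up inputs, the "pink" case is bounded within ±Δ, the "blue" case is bounded above Δ, and the "orange" case is inconclusive: its interval excludes zero but still overlaps the upper bound.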
Nov 18, 2024 21 tweets 6 min read
Do real stakes/incentives matter in experiments? Recent studies say they don’t. My new paper shows that these studies’ results — and those of most hypothetical bias experiments — are uninformative when we care about experimental treatment effects. 1/x
🔗: papers.tinbergen.nl/24070.pdf
Historically, experimental economists virtually always tied experimental choices to real stakes/payoffs to improve generalizability. That’s changing: many economists now use hypothetical stakes in online experiments + large-scale survey experiments. 2/x
Oct 10, 2024 33 tweets 10 min read
🧵 on my replication of Moscona & Sastry (2023, QJE).
TL;DR: MS23 proxy 'innovation exposure' with a measure of heat. Using direct innovation measures from the paper’s own data decreases headline estimates of innovation’s mitigatory impact on climate change damage by >99.8%. 1/x
Moscona & Sastry (2023) reach two findings. First, climate change spurs agricultural innovation. Crops with croplands more exposed to extreme heat see increases in variety development and climate change-related patenting. 2/x
Oct 2, 2024 25 tweets 8 min read
My paper is out in @PNASNews! I replicate a paper on the impact of COVID vaccine mandates on vaccine uptake. Removing a single bad control variable sign-flips several of the paper’s headline results. The reply’s findings are also not robust. 1/x
pnas.org/doi/10.1073/pn…
@PNASNews Rains & Richards (2024), henceforth RR24, reach two findings. First, RR24 claim that difference-in-differences estimates show that US state COVID vaccine mandates had imprecise impacts on COVID vaccine uptake. 2/x
Jul 22, 2024 16 tweets 5 min read
🚨 WP alert! 🚨 I design equivalence tests for running variable (RV) manipulation in regression discontinuity (RDD), show that serious RV manipulation can't be ruled out in lots of published RDD research, and offer the lddtest command in Stata/R. 1/x

🔗: hdl.handle.net/10419/300277
Credible RDD estimation relies on the assumption that agents can’t endogenously sort their RVs to opt themselves into or out of treatment. If they can, then RDD estimates are confounded: agents who manipulate RVs are likely different in important ways from agents who don't. 2/x