Too many broadly useful stats methods are masked in domain-specific language. In my new pair of posts, I discuss formula-free #causalinference design patterns to help data analysts recognize frameworks as they encounter them in everyday work
I don't rehash the finer details; for that, my resource round-up post catalogues the plethora of amazing books freely available from @_MiguelHernan, @causalinf, @CasualBrady, @nickchk, and more
Instead, I simply focus on the bare-bones frameworks. While econ and epi Twitter talk about CI nonstop, I'm struck by how underutilized some of the basics are in industry, where we have rich high-dimensional panel data, well-defined but non-random treatment mechanisms, etc.
• • •
Cut an #rstats script's runtime from 2+ hours to <5 minutes and feel extremely powerful (even though arguably the first version was just bad code)
Don’t know who needs this but a few random tips below. Easy once you’ve heard them but often outside of intro content 👇🏻
Run iterations in parallel! If you’re using {purrr} this is *ridiculously* easy with @dvaughan32’s {furrr}
You truly just add ‘future_’ prefixes to map functions
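A minimal sketch of that swap, assuming {purrr} and {furrr} are installed. The `slow_square()` function and the worker count are made up for illustration; the one extra step beyond the prefix is choosing a parallel backend with `plan()`, since without it the `future_` functions still run sequentially.

```r
library(purrr)
library(furrr)  # also attaches {future}, which provides plan()

# Hypothetical stand-in for an expensive per-item computation
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Pick a parallel backend; 2 workers is an arbitrary choice for this sketch
plan(multisession, workers = 2)

res_seq <- map_dbl(1:8, slow_square)         # sequential {purrr} version
res_par <- future_map_dbl(1:8, slow_square)  # same call with the future_ prefix

# Return to sequential processing when done
plan(sequential)
```

Both calls should return the same values; only the scheduling differs. Any speed-up depends on how expensive each iteration is relative to the overhead of starting workers.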
Remove anything from the iteration that can be done outside it, including data preprocessing (e.g. type conversion) or post-processing (e.g. normalizing everything by the same constant)
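A rough base R sketch of that idea, using made-up data: the `raw` data frame, the per-group sum, and `scale_const` are all hypothetical, and the original script likely looked quite different. The first version converts types and normalizes inside every iteration; the second does each of those once, outside.

```r
set.seed(42)

# Hypothetical data: numeric values that arrived as character strings
raw <- data.frame(
  id    = rep(1:50, each = 20),
  value = as.character(round(runif(1000), 3)),
  stringsAsFactors = FALSE
)
scale_const <- 100  # illustrative normalizing constant

# Before: type conversion and normalization repeated in every iteration
slow <- sapply(split(raw, raw$id), function(d) {
  sum(as.numeric(d$value)) / scale_const
})

# After: convert once up front, iterate over the minimal work, normalize once at the end
raw$value_num <- as.numeric(raw$value)
fast <- sapply(split(raw$value_num, raw$id), sum) / scale_const
```

The two results should match; the second version just avoids redoing the same conversion and division for every group.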
Fantastic intro to forecasting building from basic principles to complex models. Also gives context to appreciate a lot of exciting work happening in {tidyverts} tidyverts.org