One interesting thing is that lag() and lead() take in a vector and return a vector (similar to ntile).
This means that these functions *should not* be auto-vectorized. So in addition to being re-exported, they are included on the package’s do-not-vectorize list.
In the near future, we will provide a mechanism to edit the package’s do-not-vectorize list of functions.
For frequently used functions, this means you won’t have to use the tilde-prefix notation to call them.
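To see why lag() and lead() need the whole vector, here is a minimal sketch in plain Python. It is not Tidier.jl's implementation: the `lag` and `lead` functions below are stand-ins written for illustration, and padding with `None` is an assumption standing in for a missing value.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def lag(values: Sequence[T], n: int = 1) -> List[Optional[T]]:
    """Shift values toward the end by n positions, padding the front with None."""
    if n < 0:
        raise ValueError("n must be non-negative")
    n = min(n, len(values))
    return [None] * n + list(values[: len(values) - n])


def lead(values: Sequence[T], n: int = 1) -> List[Optional[T]]:
    """Shift values toward the start by n positions, padding the end with None."""
    if n < 0:
        raise ValueError("n must be non-negative")
    n = min(n, len(values))
    return list(values[n:]) + [None] * n


sales = [10, 20, 30, 40]  # made-up column, for illustration only

# Called on the whole vector, each element can see its neighbor.
lagged = lag(sales)
led = lead(sales)

# Applied one element at a time (what auto-vectorizing would do),
# every call sees a length-1 vector, so there is nothing to shift in.
elementwise = [lag([x])[0] for x in sales]

print(lagged)
print(led)
print(elementwise)
```

The element-wise version loses every neighbor, which is why functions like these belong on a do-not-vectorize list.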
In an earlier single-center study @umichmedicine, our paper and accompanying editorial framed our AUC of 0.63 as a failure of “external” validity. The result was somewhat surprising bc other studies reported higher AUCs/sens/spec.
🧹Tidier.jl 0.6.0 is available on the #JuliaLang registry.
What’s new?
- New logo!
- distinct()
- n(), row_number() work *everywhere*
- `!` for negative selection
- pivoting functions are better
- bug fixes to mutate() and slice()
For years, I’ve been trying out different non-tidyverse implementations of tidyverse. It’s fun seeing folks mold languages to run analysis code inspired by it.
If you like screenshots of code, come along for a visual tour.
Let’s start w/ R.
If you thought that one tidyverse was enough for R, you would be wrong.
There are 2 fully independent re-implementations: {poorman} and {tidytable}.
{poorman} is powered by base R only - no dependencies! It’s a great pkg to use with binder/CI workflows.
{tidytable} has a similar premise, except it relies primarily on {data.table} and {tidyselect}.
While it’s similar to {dtplyr} in some ways, the syntax is even cleaner bc you don’t need to declare your data.table or use collect() to get the results.
When we link an intervention to a model threshold (eg alerts), we often worry about overalerting.
Overalerting can take on multiple forms. Either there are too many alerts bc many alerts are wrong. Or, there are too many alerts bc we lack capacity to act even if they are right.
Consider this: a model scoring 10 patients. Using a threshold of 20%, you identify 4 out of 5 patients needing ICU-level care.
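Here is a rough sketch of that scenario in Python. The individual risk scores, the outcomes, and the capacity of 3 are all invented for illustration (only "10 patients", "threshold of 20%", and "4 out of 5" come from the example above), and this is not code from the {modelrecon} package.

```python
# Hypothetical (predicted_risk, needs_icu) pairs for 10 patients.
# Values are invented so that a 20% threshold finds 4 of the 5 true cases.
patients = [
    (0.90, True),
    (0.60, True),
    (0.45, False),
    (0.35, True),
    (0.30, False),
    (0.25, True),
    (0.15, True),   # the one true case missed by the threshold
    (0.10, False),
    (0.05, False),
    (0.02, False),
]

THRESHOLD = 0.20
CAPACITY = 3  # hypothetical: we can only act on 3 alerts

# Everyone at or above the threshold triggers an alert.
flagged = [p for p in patients if p[0] >= THRESHOLD]

total_needing_icu = sum(needs for _, needs in patients)
found_by_threshold = sum(needs for _, needs in flagged)

# With limited capacity, act only on the highest-risk alerts.
acted_on = sorted(flagged, key=lambda p: p[0], reverse=True)[:CAPACITY]
helped_under_capacity = sum(needs for _, needs in acted_on)

print(f"alerts: {len(flagged)}")
print(f"true cases found: {found_by_threshold} of {total_needing_icu}")
print(f"true cases reached with capacity {CAPACITY}: {helped_under_capacity}")
```

In this made-up cohort, both kinds of overalerting show up: some alerts are wrong, and even the correct ones outnumber what we can act on.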
We use 4 case studies to show how a resource constraint diminishes the usefulness of a model and changes the optimal resource allocation strategy.
We show that some of the usefulness can be recouped by introducing a relative constraint (and relaxing the absolute constraint).
All of the results in the paper can be reproduced using the accompanying {modelrecon} #rstats package and are shown in the README file accompanying the package.
My lab is moving to #JuliaLang, and I’ll be putting together some R => Julia tips for our lab and others who are interested.
Here are a few starter facts. Feel free to tag along!
Julia draws inspiration from a number of languages, but the influence of R on Julia is clear.
Let's start with packages.
Like R, Julia comes with a package manager that can be used to install pkgs from within the console (or REPL). The Pkg package isn't loaded automatically in Julia, but loading it is easy: just run `using Pkg`.
Both are different from Python's command line approach to pkgs.
Julia natively takes pkg management much further than R. Want to install a package from GitHub? Easy, just pass a url argument to Pkg.add().