Post

How to get URL link on X (Twitter) App

On the Twitter thread, click on or icon on the bottom
Click again on or Share Via icon
Click on Copy Link to Tweet
Paste it above and click "Unroll Thread"!
More info at Twitter Help

Karandeep Singh

@kdpsinghlab

Mar 18, 2023 • 9 tweets • 6 min read • Read on X

Scrolly

🧹Tidier.jl 0.6.0 is available on the #JuliaLang registry.

What’s new?

- New logo!
- distinct()
- n(), row_number() work *everywhere*
- `!` for negative selection
- pivoting functions are better
- bug fixes to mutate() and slice()

Docs: kdpsingh.github.io/Tidier.jl/dev/

A short tour.

If you use distinct() without any arguments, it behaves just like the #rstats {tidyverse} distinct().

It checks if rows are unique, and returns all columns just as you would expect.

If you use distinct() with arguments as shown here, then it returns all columns for unique values of the supplied column.

This is slightly different behavior than {tidyverse} distinct(), but I kind of like it. Can easily pair this with select() to mimic dplyr behavior.

Another thing that’s new is the helper function n().

While seemingly simple, implementing this was fairly difficult. When used inside of summarize/summarise, it behaves just like DataFrames.jl’s nrow() function.

So far, so good.

However, n() — and its counterpart row_number() — also work inside of mutate(), where nrow() from DataFrames.jl isn’t as straightforward to use, particularly inside of an expression (e.g. n() + 1).

n() provides a standard interface from all functions.

You can even use n() inside of slice().

To select the last 2 rows of a dataframe?

slice(n() - 1 : n()) — notice the order of operations is slightly different from R bc the `-` takes precedence over the `:` so no need for extra `()`.

Otherwise, this is exactly like R tidyverse.

And when the update says that n() and row_number() work *everywhere*, it’s really true.

You can even use row_number() inside of filter() to mimic the functionality provided by slice(), just like in R.

@PVDimens

A hearty thanks to @PVDimens @randyboyes @zzhaozhe @PietroMonticone @Gaspardelanoche @LazarusAlon for contributions to Tidier.jl, to many others in the #JuliaLang community for code, blogs, and inspiration, and to #rstats and @posit_pbc colleagues for the tidyverse ecosystem! 🙏

@PVDimens

And finally, ICYMI, here’s the new logo thanks to @PVDimens!

• • •

Missing some Tweet in this thread? You can try to force a refresh

This Thread may be Removed Anytime!

Twitter may remove this content at anytime! Save it as PDF for later use!

More from @kdpsinghlab

Karandeep Singh

@kdpsinghlab

Oct 12, 2023

https://twitter.com/annalsofim/status/1711834106009419777

When an outcome influences a predictor, it’s “outcome leakage.” But what about when a predictor influences an outcome?

With @AkhilVaidMD @girish_nadkarni et al, we simulated what happens when a model predicts a bad outcome, but then you intervene to prevent that outcome.

https://twitter.com/annalsofim/status/1711834106009419777

If you evaluate such a model *after* it has been linked to a clinical workflow, the model’s “apparent” performance will look worse.

People who were supposed to experience the outcome didn’t experience it (bc you prevented it!)

This matters most when interventions are effective.

But what if you update a model? Or if you build a 2nd related model after the first one is implemented? Or if you simultaneously implement 2 models where the intervention for one affects the predictors of the other?

You end up with scenarios where the performance looks worse.

Read 5 tweets

Karandeep Singh

@kdpsinghlab

Sep 30, 2023

🧹Amazing progress on TidierPlots.jl by @randyboyes.

What’s new?

1. It looks *just* like ggplot2 now - nearly all macros converted to functions.

2. Thanks to #JuliaLang’s multiple dispatch, you can add plots together using `+` OR use pipes.

3. ggsave()

4. Works with Pluto.jl

TidierPlots.jl is getting to be crazily feature-complete, even supporting `geom_text()`, `geom_label()`, and faceting.

And if you’re a #JuliaLang user of AlgebraOfGraphics, what’s super cool is that you can literally mix and match TidierPlots code with AlgebraOfGraphics code because TidierPlots uses AoG under the hood.

Read 4 tweets

Karandeep Singh

@kdpsinghlab

Apr 3, 2023

@_plyons

Why does a proprietary sepsis model “work” at some hospitals but not others?

Is it generalizability? Measurement? Intervention? Patient population? Margin for improvement? Resource constraints?

Working with a team led by @_plyons, we looked at a 9-hospital network.

A story.

https://twitter.com/prpayne5/status/1642917015739416577

@umichmedicine

In earlier single-center study @umichmedicine, our paper and accompanying editorial framed our AUC 0.63 as a failure of “external” validity. The result was somewhat surprising bc other studies reported higher AUCs/sens/spec.

Why?

jamanetwork.com/journals/jamai… jamanetwork.com/journals/jamai…

Colleagues at UColorado reported an AUC of 0.73, and at Case Metro reported sens/spec of 90%/68% - quite good.

But both of those results included scores calculated after sepsis onset (and in some cases recognition).

And this is especially important here because…

Read 22 tweets

Karandeep Singh

@kdpsinghlab

Apr 2, 2023

@KriseScheuch

🧹 Tidier.jl v0.7.1 is now on the #JuliaLang registry.

What’s new?

- drop_na()
- lag() and lead() - re-exported from ShiftedArrays.jl
- Bugfix to ntile() if all values are missing

Thanks to @KriseScheuch for feature suggestions!

github.com/kdpsingh/Tidie…

One interesting thing is that lag() and lead() take in a vector and return a vector (similar to ntile).

This means that these functions *should not* be auto-vectorized. So in addition to re-exporting, they are included on the package’s do-not-vectorize list.

https://twitter.com/longemen3000/status/1628886335304962049

In the near future, we will provide a mechanism to edit the package’s do-not-vectorize list of functions.

For frequently used functions, this means you won’t have to use the tilde-prefix notation to call them.

See here for details on auto-vectorization: kdpsingh.github.io/Tidier.jl/dev/…

https://twitter.com/longemen3000/status/1628886335304962049

Read 5 tweets

Karandeep Singh

@kdpsinghlab

Feb 25, 2023

A Visual Tour of the Meta-Tidyverse

For years, I’ve been trying out different non-tidyverse implementations of tidyverse. It’s fun seeing folks mold languages to run analysis code inspired by it.

If you like screenshots of code, come along for a visual tour.

Let’s start w/ R.

If you thought that one tidyverse was enough for R, you would be wrong.

There are 2 fully independent re-implementations: {poorman} and {tidytable}.

{poorman} is powered by base R only - no dependencies! It’s a great pkg to use with binder/CI workflows.

cran.r-project.org/web/packages/p…

{tidytable} has a similar premise, except it relies primarily on {data.table} and {tidyselect}.

While it’s similar to {dtplyr} in some ways, the syntax is even cleaner bc you don’t need to declare your data.table or use collect() to get the results.

github.com/markfairbanks/…

Read 16 tweets

Karandeep Singh

@kdpsinghlab

Feb 23, 2023

https://twitter.com/kdpsinghlab/status/1628464344785727489

If a tree falls in the forest but there’s no one around to hear it, does it really make a sound?

If a model detects a patient in need of ICU-level care but there are no ICU beds available, did the model really help the patient?

https://twitter.com/kdpsinghlab/status/1628464344785727489

When we link an intervention to a model threshold (eg alerts), we often worry about overalerting.

Overalerting can take on multiple forms. Either there are too many alerts bc many alerts are wrong. Or, there are too many alerts bc we lack capacity to act even if they are right.

Consider this: a model scoring 10 patients. Using a threshold of 20%, you identify 4 out of 5 patients needing ICU-level care.

The sensitivity is 80%, right?

But what if you only have 3 beds available?

Read 8 tweets

Support us! We are indie developers!

This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Share this page!

Enter URL or ID to Unroll

Karandeep Singh

Try unrolling a thread yourself!

More from @kdpsinghlab

Karandeep Singh

Karandeep Singh

Karandeep Singh

Karandeep Singh

Karandeep Singh

Karandeep Singh

Did Thread Reader help you today?

Don't want to be a Premium member but still want to support us?

Send Email!