Karandeep Singh
Jacobs Chancellor’s Endowed Chair @UCSanDiego. Chief Health AI Officer @UCSDHealth. Creator of Tidier.jl #JuliaLang. #GoBlue. Views own.
Oct 12, 2023 5 tweets 1 min read
When an outcome influences a predictor, it’s “outcome leakage.” But what about when a predictor influences an outcome?

With @AkhilVaidMD @girish_nadkarni et al, we simulated what happens when a model predicts a bad outcome, but then you intervene to prevent that outcome. If you evaluate such a model *after* it has been linked to a clinical workflow, the model’s “apparent” performance will look worse.

People who were supposed to experience the outcome didn’t experience it (bc you prevented it!)

This matters most when interventions are effective.
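To make the mechanism concrete, here is a minimal sketch (hypothetical numbers, not the paper’s actual simulation) of how an effective intervention deflates apparent performance among flagged patients:

```julia
using Random

# Hypothetical sketch: 30% of flagged patients would experience the outcome,
# but an alert-triggered intervention prevents 60% of those outcomes.
Random.seed!(42)
n = 10_000
would_have_outcome = rand(n) .< 0.30                  # counterfactual outcomes
prevented = would_have_outcome .& (rand(n) .< 0.60)   # effective intervention
observed = would_have_outcome .& .!prevented          # what we actually record

println("PPV before workflow linkage: ", count(would_have_outcome) / n)  # ~0.30
println("Apparent PPV after linkage:  ", count(observed) / n)            # ~0.12
```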
Sep 30, 2023 4 tweets 2 min read
🧹Amazing progress on TidierPlots.jl by @randyboyes.

What’s new?

1. It looks *just* like ggplot2 now - nearly all macros converted to functions.

2. Thanks to #JuliaLang’s multiple dispatch, you can add plots together using `+` OR use pipes.

3. ggsave()

4. Works with Pluto.jl
TidierPlots.jl is getting to be crazily feature-complete, even supporting `geom_text()`, `geom_label()`, and faceting.
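A minimal sketch of what the function-based API looks like (illustrative only; the exact aes() column syntax and ggsave() argument order may differ across TidierPlots versions):

```julia
using TidierPlots
using DataFrames

df = DataFrame(x = 1:10, y = (1:10) .^ 2, grp = repeat(["a", "b"], 5))

# Compose layers with `+`, just like ggplot2:
p = ggplot(df, aes(x = :x, y = :y, color = :grp)) +
    geom_point() +
    geom_line()

# ggsave() now exists too (argument order shown here is an assumption):
ggsave("plot.png", p)
```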
Apr 3, 2023 22 tweets 11 min read
Why does a proprietary sepsis model “work” at some hospitals but not others?

Is it generalizability? Measurement? Intervention? Patient population? Margin for improvement? Resource constraints?

Working with a team led by @_plyons, we looked at a 9-hospital network.

A story. In an earlier single-center study at @umichmedicine, our paper and accompanying editorial framed our AUC of 0.63 as a failure of “external” validity. The result was somewhat surprising bc other studies reported higher AUCs/sens/spec.

Why?

jamanetwork.com/journals/jamai…
jamanetwork.com/journals/jamai…
Apr 2, 2023 5 tweets 3 min read
🧹 Tidier.jl v0.7.1 is now on the #JuliaLang registry.

What’s new?

- drop_na()
- lag() and lead() - re-exported from ShiftedArrays.jl
- Bugfix to ntile() if all values are missing

Thanks to @KriseScheuch for feature suggestions!

github.com/kdpsingh/Tidie…

One interesting thing is that lag() and lead() take in a vector and return a vector (similar to ntile()).

This means that these functions *should not* be auto-vectorized. So in addition to re-exporting, they are included on the package’s do-not-vectorize list.
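For example (a small sketch; column syntax per the Tidier.jl docs at the time):

```julia
using Tidier
using DataFrames

df = DataFrame(day = 1:5, sales = [10, 12, 9, 15, 14])

# lag() and lead() receive the whole column and return a whole column,
# which is why Tidier.jl puts them on the do-not-vectorize list:
@chain df begin
    @mutate(prev_sales = lag(sales),
            next_sales = lead(sales))
end
```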
Mar 18, 2023 9 tweets 6 min read
🧹Tidier.jl 0.6.0 is available on the #JuliaLang registry.

What’s new?

- New logo!
- distinct()
- n(), row_number() work *everywhere*
- `!` for negative selection
- pivoting functions are better
- bug fixes to mutate() and slice()

Docs: kdpsingh.github.io/Tidier.jl/dev/

A short tour.

If you use distinct() without any arguments, it behaves just like the #rstats {tidyverse} distinct().

It checks if rows are unique, and returns all columns just as you would expect.
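A quick sketch of the no-argument form described above:

```julia
using Tidier
using DataFrames

df = DataFrame(a = [1, 1, 2], b = ["x", "x", "y"])

# With no arguments, @distinct() keeps unique rows and returns all columns:
@chain df begin
    @distinct()
end
# 2×2 DataFrame: rows (1, "x") and (2, "y")
```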
Feb 25, 2023 16 tweets 12 min read
A Visual Tour of the Meta-Tidyverse

For years, I’ve been trying out different non-tidyverse implementations of tidyverse. It’s fun seeing folks mold languages to run analysis code inspired by it.

If you like screenshots of code, come along for a visual tour.

Let’s start w/ R.

If you thought that one tidyverse was enough for R, you would be wrong.

There are 2 fully independent re-implementations: {poorman} and {tidytable}.

{poorman} is powered by base R only - no dependencies! It’s a great pkg to use with binder/CI workflows.

cran.r-project.org/web/packages/p…
Feb 23, 2023 8 tweets 2 min read
If a tree falls in the forest but there’s no one around to hear it, does it really make a sound?

If a model detects a patient in need of ICU-level care but there are no ICU beds available, did the model really help the patient?

When we link an intervention to a model threshold (eg alerts), we often worry about overalerting.

Overalerting can take multiple forms: either there are too many alerts bc many of them are wrong, or there are too many alerts bc we lack the capacity to act even when they are right.
Feb 22, 2023 5 tweets 4 min read
Why do seemingly useful models fail to improve clinical outcomes when implemented? Resource constraints.

In this paper, we describe constraints, how they affect net benefit, and how they apply to other measures.

Paper: academic.oup.com/jamia/advance-…

R pkg: github.com/ML4LHS/modelre…

We use 4 case studies to show how a resource constraint diminishes the usefulness of a model and changes the optimal resource allocation strategy.

We show that some of the usefulness can be recouped by introducing a relative constraint (and relaxing the absolute constraint).
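A toy sketch of the core idea (hypothetical numbers, not the paper’s case studies), using the standard net benefit formula:

```julia
# Net benefit at threshold probability pt: NB = TP/n - (FP/n) * pt/(1 - pt)
net_benefit(tp, fp, n, pt) = tp / n - (fp / n) * pt / (1 - pt)

n, tp, fp, pt = 1000, 80, 120, 0.2        # 200 alerts fired in total

# Absolute constraint: capacity to act on only 100 of the 200 alerts.
# If acted-on alerts are chosen at random, we keep a proportional share
# of TPs and FPs, so the achievable net benefit shrinks:
capacity = 100
frac = capacity / (tp + fp)

println("Unconstrained NB: ", net_benefit(tp, fp, n, pt))               # 0.05
println("Constrained NB:   ", net_benefit(tp * frac, fp * frac, n, pt)) # 0.025
```

In this toy setup, how you choose which alerts to act on determines how much benefit survives the constraint, which is the intuition the case studies formalize.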
Jan 31, 2023 33 tweets 9 min read
My lab is moving to #JuliaLang, and I’ll be putting together some R => Julia tips for our lab and others who are interested.

Here are a few starter facts. Feel free to tag along!

Julia draws inspiration from a number of languages, but the influence of R on Julia is clear. Let's start with packages.

Like R, Julia comes with a package manager that can be used to install pkgs from within the console (or REPL). The Pkg package isn’t automatically imported in Julia, but it’s easy to import.

Both are different from Python's command line approach to pkgs.
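Concretely:

```julia
# Import the package manager (not loaded by default) and add a package:
using Pkg
Pkg.add("DataFrames")

# Equivalently, press `]` at the julia> prompt to enter Pkg REPL mode:
# (@v1.10) pkg> add DataFrames
```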
Sep 29, 2022 11 tweets 5 min read
The new FDA guidance on CDS software is important but not for the reasons you might expect.

tl;dr: This document clarifies what the FDA *isn't going to regulate* and says little about *how* it's going to regulate CDS it considers to be a device.

Link: fda.gov/media/109618/d…

While the FDA was established formally by the FD&C Act in 1938, it didn’t gain the authority to regulate medical devices until 1976, when the FD&C Act was amended.

ncbi.nlm.nih.gov/pmc/articles/P…
Feb 22, 2022 10 tweets 4 min read
IMO, this is the *biggest* development in the R language since the pipe was first introduced.

To understand why, you need to know a bit about Flash, Java, JavaScript, LLVM, Emscripten, and asm.js.

GH repo: github.com/georgestagg/we…
GH repo for R packages: github.com/georgestagg/we…
When the web was first introduced, there wasn’t a clear choice of what scripting language should be used before the world settled on JavaScript, which implements the ECMAScript specification (see here: tc39.es/ecma262/).

Should browsers bother running code?
Feb 5, 2022 13 tweets 6 min read
A prediction model paradox in health: while many are developed, few are recommended.

But when models *are* recommended, they often come from tertiary care hospitals.

Is this a problem? We tested 3 prostate ca models in regional/national data.

Paper: auajournals.org/doi/pdf/10.109…

When people talk about risk stratifying cancer outcomes, there’s an implicit assumption that what’s being modeled is biology.

But whose biology? Patients who present to tertiary care centers are often more complex, and only some of that complexity is measurable.
Jul 14, 2021 10 tweets 5 min read
As more health systems adopt some form of clinical AI/ML governance, one of the biggest challenges they face is the monitoring of deployed models.

And one of the scariest phenomena that occurs in deployed models is dataset shift.

Why scary?
- Often silent
- Can lead to harm

Why silent? Shouldn’t it be obvious if models get miscalibrated over time?

This is called “calibration drift” and may be clinically obvious if it occurs rapidly. However, it can occur gradually and be missed.

Also, calibration drift is only a *small* subset of dataset shift.
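As one narrow illustration (toy data, and only one of many checks you’d want): tracking the observed-to-expected outcome ratio over time can surface gradual calibration drift.

```julia
using DataFrames, Statistics

# Toy monitoring check: observed-to-expected (O/E) ratio by month.
# O/E drifting away from 1.0 suggests calibration drift; a stable O/E
# does NOT rule out other forms of dataset shift.
df = DataFrame(
    month = repeat(1:3, inner = 4),
    pred  = repeat([0.2, 0.3, 0.1, 0.4], 3),         # model-predicted risks
    obs   = [0, 1, 0, 1,  0, 0, 0, 1,  0, 0, 0, 0],  # observed outcomes
)

for g in groupby(df, :month)
    oe = mean(g.obs) / mean(g.pred)
    println("month ", g.month[1], ": O/E = ", round(oe, digits = 2))
end
```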
Jun 24, 2021 19 tweets 7 min read
Let's talk limitations.

One of Epic’s public criticisms is that our analysis was “hypothetical” and thus problematic. They’re not wrong about it being hypothetical, and there *are* some real limitations worth pointing out. Let’s dive into our paper’s limitations, big and small.

Our biggest limitation is that our results come from a single center.

Tho I know of another center with a similar AUC to ours using similar code, we did not combine the results into 1 paper bc their scoring started shortly before COVID, which introduces some complications. More to come...
Jun 22, 2021 30 tweets 16 min read
Thank you to folks who have shared or commented on our paper. I know the paper is being used by some to dunk on Epic. Rather than piling on, I want to provide a clear-eyed view of what we found, what it means, and what I would suggest to Epic (& other model devs) going forward.

Here are some questions that come up:
- Are our findings due to a configuration error at @umichmedicine?
- Why do our findings differ from what Epic reports?
- Why does hospitalization-level AUC go from 0.63 to 0.80 in the sensitivity analysis?
- Are we using the model today?
Mar 16, 2021 15 tweets 6 min read
Now that we’ve discussed discrimination, calibration, and decision curve analysis (DCA), let’s talk about Scenario 2.

I am fascinated to know why folks felt that the model should not be used. Isn’t the model good?! AUC 0.75 and well-calibrated.

I agreed w/ the majority. Here’s why.

First, let’s poll folks who felt the model shouldn’t be used. What aspect of the model were you dissatisfied with?
Mar 15, 2021 8 tweets 3 min read
One property of net benefit on decision curves (as we found out below) is that for the treat-all strategy, both the x-intercept and the y-intercept equal the proportion of patients experiencing the outcome.

Ignoring the x-axis for a moment, let’s talk about the y-axis.

The maximal net benefit of a model in a given setting is determined by the proportion of people who experience the outcome.

If one hospital has a higher % of pts experiencing the outcome, the model will have a higher net benefit assuming all else is equal.

Most ppl get this.
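For the algebra-minded, here is why both intercepts land at the outcome proportion (π below; p_t is the threshold probability), using the standard net benefit formula for treat-all:

```latex
NB_{\text{all}}(p_t) = \pi - (1 - \pi)\,\frac{p_t}{1 - p_t}
% y-intercept: at p_t = 0, NB_all = \pi
% x-intercept: NB_all = 0  =>  \pi(1 - p_t) = (1 - \pi)\,p_t  =>  p_t = \pi
```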
Mar 14, 2021 4 tweets 1 min read
One conceptual question that comes up about decision curves and net benefit is: Don’t you need a clinical trial to establish benefit?

You do, but decision curve analysis is like a power calculation.

If your model is expected to confer negative benefit, why do a trial at all?

So if the model has a positive net benefit, why do a trial?

Because the current standard of care isn’t quite treat-all or treat-none. It’s somewhere in between but we rarely have a window into how clinicians estimate risks in current practice, even if qualitative.
Mar 14, 2021 25 tweets 9 min read
Thanks everyone for your votes on each of the scenarios. In this thread, I’ll walk through Scenario 1 — both how I thought about it originally, and how decision curves can help.

Let’s get started.

Popular opinion was to use neither model. My vote was for Model B. Here’s why.

I’ll get to why I voted for Model B, but I’ll start in order and share everything I looked at to arrive at that opinion.

From the threshold-perf plot (TPP), we can tell that the outcome occurs in ~25% bc the PPV at a threshold of 0 (left), where all preds are positive, is 25%.
Mar 2, 2021 17 tweets 6 min read
I initially said that the reason I think ppl get confused about decision curve analysis (DCA) was because they don't think of the decision threshold as being connected to the cost/benefit ratio.

I'm realizing this is not a mathematical issue but a conceptual one.

A thread.

One issue that touched a nerve was that in my example, I calculated a post-hoc threshold based on sensitivity. Was I wrong to do this? Let me give an example as to why this happened in this situation, why it *might’ve* been our only option, and how I would do it differently.
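The conceptual link, stated as the standard decision-curve identity: picking a threshold probability p_t *is* asserting a harm:benefit ratio.

```latex
\frac{p_t}{1 - p_t}
  = \frac{\text{harm of an unnecessary intervention (acting on a FP)}}
         {\text{benefit of a needed intervention (acting on a TP)}}
% e.g., p_t = 0.2 gives odds of 1:4 -- finding one true case is worth
% up to four unnecessary interventions
```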
Feb 28, 2021 9 tweets 3 min read
Decision curves are a way to understand the population-level "net benefit" of implementing a prediction model *without* needing to *explicitly* account for actual costs and benefits.

Even experts seemingly don't understand it, and admit it!

How is this possible? Is it laziness?

Lack of understanding isn’t for a lack of trying on the part of its authors. There are dozens of papers w/ thousands of citations!

So what can I possibly add? Personal experience. I’ll walk through why I was confused and how it finally clicked for me.

mskcc.org/departments/ep…