Now that everyone is (justifiably) up in arms about CurateScience, may I turn your attention to SciScore™, claimed to be "the best methods review tool for scientific articles." (direct quote, plastered all over their website)

sciscore.com
For starters, for a "tool" all about transparency, reproducibility, etc., it's pretty damn hard to find what it is they are actually reviewing for.

There is no page explaining it, nothing in the FAQ, and you have to dig around a bit.
To find out, you have to follow a link on the "RTI" tab to the preprint:

"Rigor and Transparency Index, a new metric of quality for assessing biological and medical science methods"

doi.org/10.1101/2020.0…

FWIW, having a paper for their metric is a good thing, at least.
So what is it that "the best methods review tool for scientific articles" has selected as its indicators?

This. That's it. That's the whole thing.

This is an extremely shallow and unsophisticated set of metrics that are being inappropriately applied universally.
Now, nothing on this list is inherently a bad thing, but it's not even REMOTELY a meaningful framework for methodological quality or review. It's shallow, weak, vastly overgeneralized, and completely unvalidated against better metrics.

As a "methods review tool" it's nonsense.
One might want to start with a well-vetted framework and a thorough discussion of the review methods and ideas before, say, tossing a machine learning algorithm at that framework, producing "reports," and tweeting them out for massive numbers of preprints in the bio/med literature, maybe?
It has all of the same basic, foundational problems as CurateScience's "audits", plus more.

On top of how hard it is to actually see what their "tool" does, it's pretty hard to figure out who makes/runs it. That's really not good for something about transparency and accountability.
Please don't go searching for them and piling on; this is not to shame the individuals making it, who are likely well intentioned. It's just to point out that the tools we use to rigorously evaluate transparency and rigor must themselves be transparent and rigorous.
There are probably dozens of SciScores and CurateSciences and worse out there that we just haven't seen yet. It's inevitable.

If we can't collectively get our acts together and start investing in impactful and rigorous solutions to these major problems, that's all we'll get.

• • •


More from @NoahHaber

20 Mar
Systematic reviews and meta-analyses are like plywood. While they often have a pretty veneer, they are only as useful as the layers of materials they are made of and how it's all put together.

In this essay I will
Hm, might do this, since plywood is so, so much cooler than people give it credit for, and there's a good analogy to be made with how cross-grain layers are complementary and hold things in check.

I have nerd sniped myself.
Ugh, fine, I'll do it.

Systematic reviews and meta-analyses are like plywood, and plywood is crazy cool stuff that I bet you never even thought about.

But first we have to talk about wood.

Thread!
19 Mar
FWIW, having "grown up" in econ (and now spending 90% of my time in a different field entirely), this statement strikes me as a pretty accurate description of economists as a whole, and a major source of inter-field friction.
I do think that there is something to the fungibility of a lot of econ-style frameworks and ways of approaching problems, BUT in combination with hyperconfidence it gets econs (including me) in trouble.

I've had to learn to unlearn a lot of that hyperconfidence.
Note: fungibility is NOT AT ALL the same thing as superiority, and I think that particular line may be where the error is.

I (clearly) think there is a HUGE amount of untapped value in bridging disciplinary gaps, as indicated by the fact that I've bet the farm on it.
8 Mar
At the risk of getting involved in a discussion I really don't want to be involved in:

Excepting extreme circumstances, even very effective or very damaging policies won't produce discernible "spikes" or "cliffs" in COVID-19 outcomes over time.

That includes school policies.
"There was no spike after schools opened" doesn't mean that school opening didn't cause (ultimately) large increases in COVID cases.

Similarly "There was no cliff after schools closed" doesn't really mean that the school closure didn't substantially slow spread.
That's one of the things that makes measurement of this extremely tricky; the effects of school policies would be expected to appear slowly over time, and interact with the local conditions over that period of time.

Infectious diseases are super sneaky that way.
7 Mar
I am in this picture, and honestly it's the most important thing.
Poorly designed quality assessment metrics are arguably even more objectionable than poorly designed primary methods.

Two examples of VERY poorly designed quality metrics are the Newcastle-Ottawa scale and SciScore.

Compare to something like ROBINS-I (which is really good).
One of these days I should really make a quality assessment metric assessment metric (or at least write about what makes the bad ones bad)
16 Feb
YES!!!

Most days in metascience, it feels like the odds are impossible; it's hard to believe that we'll ever make any progress at all.

And then every so often, something great happens.

This is a big deal for the future of science and publication, and I am STOKED!
Full disclosure: I contribute every so often to the NCRC team under the fantastic leadership of @KateGrabowski and many others, and have been a fan of both NCRC and eLife since they started (well before I started helping).
At some point I'll do a long thread about why this small thing is a WAY bigger deal than it sounds, but to tease: this heralds active exploration of a fundamental and long overdue rethinking and reorganizing of how science is assessed and distributed.
14 Feb
Folks: There are serious statistical, design, language, and ethics concerns with that vitamin-D RCT.

AT BEST, it's completely meaningless due to negligent statistical treatment and design, but there are more "questions"

Help us out: avoid sharing until the critics have had time.
Seriously, we (people who are frankly pretty good at this) are having a very hard time figuring out what the hell is happening in that trial.

Please give us time to do our work.
This thread is a good place to start if you want a taste.

But "super sus" is right; there is just so much here that doesn't make any sense at all, and this thread only scratches the surface.

It's gonna be a while before we figure this out.

