SPRITE is here.

It took a while, but it's here.

What is SPRITE? It's a tool for turning descriptive statistics in a scientific paper into plausible distributions.

The preprint is attached.

peerj.com/preprints/2696…
And what does that mean?

SPRITE reconstructs plausible histograms from descriptive statistics. Peep the attached - from the mean, SD, n, and the scale's endpoints (we almost always have these), we can generate a whole class of possible histograms.
What can we do with these histograms? A lot.

But mainly: we can investigate published papers with no raw data for plausibility. Basically, SPRITE lets you see what lies beneath the surface-level descriptions of a scientific result. (Sometimes, of course. Not all the time.)
And with a few tweaks, we can get creative with it. Find or exclude a certain skew. Include or exclude values. Start from an existing distribution. Generate ranges for test statistics. Reverse-engineer within-subjects data (hard, but possible).
But, most importantly, this means we can find data that's inconsistent. Mistakes, oversights, typos ... and worse.

If you make up an impossible mean/SD, SPRITE will flag it. If you make up implausible data, SPRITE can find it. Again, not all the time. But enough.
What's it programmed in? You'll love this - there's not one SPRITE, there's THREE:

1) MATLAB
2) R
3) Python

And if you don't want to download the code, there are TWO web-browser versions. This is @sTeamTraen's Shiny version. You can use it right now.

steamtraen.shinyapps.io/rsprite/
And this is @OmnesResNetwork's special sexy Python version.

(If you want the raw code, check the preprint.)

prepubmed.org/sprite/
Questions!

Q: "Is it like GRIM?"
A: It's related, yes. GRIM tells you whether a mean can exist (and GRIMMER tells you whether an SD can exist). SPRITE asks the next question: *if the mean/SD can exist, what does the data look like?*

It means you can SEE the data, not just consistency-check it.
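
If you haven't met GRIM before, the core check fits in a few lines of Python. This is a rough sketch, not code from the preprint: `grim_consistent` and its parameters are names made up for illustration, and it assumes integer-valued responses (e.g. Likert items).

```python
import math

def grim_consistent(mean: float, n: int, decimals: int = 2) -> bool:
    """Hypothetical helper: can n integer responses produce this reported mean?

    Assumes every response is a whole number, so the sum is a whole number too.
    """
    # Half a unit in the last reported decimal place, plus a little float slack.
    tol = 0.5 * 10 ** -decimals + 1e-9
    approx_total = mean * n
    # Only the integer sums on either side of mean * n can possibly work.
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        if abs(total / n - mean) <= tol:
            return True
    return False
```

With n = 28, a reported mean of 5.19 fails this check, because the nearest achievable means are 145/28 ≈ 5.179 and 146/28 ≈ 5.214.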
BUT. GRIM only works on small n's (roughly, n below 100 for means reported to two decimals). SPRITE works on everything. Any range, any sample size. And even for big samples, it's fast. Super-fast.

And I honestly don't think it's fully optimized, even in its present state.
Q: "Is it complicated?"
A: Not *really*. It's more complicated than when I first wrote about it (attached). But it's still conceptually similar: (a) we make a fake sample with the right mean, then (b) we shuffle values between bins until we have the right SD.

hackernoon.com/introducing-sp…
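
Those two steps, (a) and (b), can be sketched in Python roughly like this. It's a toy illustration of the idea only, not the released SPRITE code: `sprite_sketch`, `sample_sd`, and all the parameters are hypothetical names, and it assumes integer responses and a sample SD (n - 1 denominator).

```python
import math
import random

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def sprite_sketch(mean, sd, n, lo, hi, decimals=2, max_steps=20_000, seed=0):
    """Toy sketch: return one integer sample matching mean/SD, or None."""
    rng = random.Random(seed)
    tol = 0.5 * 10 ** -decimals + 1e-9
    total = round(mean * n)
    # The mean must be reachable from an integer sum inside the scale range.
    if abs(total / n - mean) > tol or not lo * n <= total <= hi * n:
        return None
    # Step (a): the flattest possible sample with the right sum.
    base, extra = divmod(total - lo * n, n)
    values = [lo + base + (1 if i < extra else 0) for i in range(n)]
    # Step (b): move one value up and another down (the sum never changes)
    # until the SD matches the reported one to the given precision.
    for _ in range(max_steps):
        current = sample_sd(values)
        if abs(current - sd) <= tol:
            return sorted(values)
        i, j = rng.sample(range(n), 2)
        if values[i] > values[j]:
            i, j = j, i  # ensure values[i] <= values[j]
        if current < sd:
            # SD too small: push the pair apart, staying inside [lo, hi].
            if values[i] > lo and values[j] < hi:
                values[i] -= 1
                values[j] += 1
        elif values[j] - values[i] >= 2:
            # SD too big: pull the pair together.
            values[i] += 1
            values[j] -= 1
    return None  # no match found within max_steps
```

Calling `sprite_sketch(3.0, 1.49, 20, 1, 5)` should hand back one sorted sample of 20 values on a 1-5 scale; changing `seed` gives other members of the same class of possible distributions. A `None` result means either the mean can't come from an integer sum in range, or no matching SD turned up within `max_steps`.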
Q: "Where's the name from?"
A: Not the soft drink. A 'sprite' is an elf or a pixie, & how I imagined the code originally - jumping around, fast and weightless, until it solved. It's small, quick, flexible, and multi-talented.

And I couldn't make 'Tinkerbell' into an acronym.
Q: "What happens to it now?"
A: Well, I hope people use it. Find that paper you don't understand, or don't trust. Try to reproduce its distributions. Look *inside* the numbers.
Real talk - our experience leads us to believe there are a lot of papers in the world with serious problems that no-one has ever noticed before.

I'd like to think this *isn't* because researchers don't care. Bad research obviously bothers most researchers.
However, (a) we often don't have the right tools to investigate these mistakes, and (b) people simply don't know that it's possible to look for them. We hope both of these things change as the error detection toolkit gets bigger.
Final note: if you're going to @improvingpsych this year, @sTeamTraen & I will be presenting a workshop on how to use SPRITE for error detection in our seminar led by the esteemed and dangerous @MicheleNuijten.

Sunday June 24th, 2-5pm. Bring papers you don't trust with you.
Big fat props to co-authors @OmnesResNetwork @Research_Tim @sTeamTraen. This has taken 15 months of everyone's part-time work, and no-one got paid. As usual.
