🏴James Heathers🏴 @jamesheathers
Here's a journal format I'd like to see: a rich, rare, or massive dataset published as an integrated, multi-stage special issue.

Paper 1: publish a paper describing a clean, annotated raw dataset, with very explicit descriptions of the measures. A *really useful*, ready-to-use dataset.
Papers 2 through n: pre-registered projects, each presenting a hypothesis that can be addressed using the dataset. Could be anything. They could even overlap a bit.

(Naturally, the more utility the dataset has, the more people will preregister to use it.)
If a pre-reg is accepted, the researchers get the dataset (hooray, less work!) and proceed straight to analysis. The Paper 1 authors are available for questions on their carefully curated dataset.

All papers submitted by a hard deadline - prevents scooping, competition, first-to-market BS.
Because the papers are pre-registered, review is simple. Then the *whole stinkin' lot* can be released at once... papers 2 through n, the dataset itself, the underlying code, bang. Exploratory and follow-up work can proceed from there.
Why?

Some people have ideas but no data. Perhaps it's too expensive, requires weird or unusual machines, or needs hard-to-access populations of people/cells/etc. Perhaps they don't have a grant and need pilot results.
Some people have data (or apparatus) but no time/resources to look at it. But they don't want to drop their numbers into the public domain and never derive any personal value from them. The data sits mothballed 'until they get to it', or 'until they can get a student', etc.
These projects would make ideal PhD projects for analytically minded students, produce good papers to base grants or new projects on, share resources, and reward people who hold valuable data but are reluctant to put it in the public domain by letting them publish big, important papers.
It's a way of sharing resources between work groups where everyone wins. It's a platform for making sure under-resourced people and labs (THERE ARE SO MANY) get access to good measurements.

I have no idea if it'd work, or if it exists already. But I'd sure like to do it.
I'll give you an example from biosignals.

If you're interested in cardiorespiratory measures, imagine a perfectly clean dataset: Frank-lead ECGs (i.e. three-axis), respiration, *two* simultaneous measurements of non-invasive BP, AND finger pulse. Baseline, then stress tasks.
Long recording times, great sample size, equipment that engineers and signal analysts won't have - you got a radial applanation tonometry rig in your lab? no? me neither :(
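To make the "ready-to-use" part concrete, here's a minimal sketch of how one participant's recording in a dataset like that might be packaged. Everything in it is an illustrative assumption: the `Channel`/`Recording` classes, the channel names, units, and sampling rates are hypothetical, and the sample values are random placeholders, not real physiology.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class Channel:
    """One continuously sampled signal plus the metadata needed to reuse it."""
    name: str            # e.g. "ecg_x" for the X lead of a Frank-lead ECG (hypothetical naming)
    units: str           # physical units of the samples
    fs_hz: float         # sampling rate in Hz
    samples: np.ndarray  # raw signal


@dataclass
class Recording:
    """One participant's session under one condition (baseline or stress task)."""
    participant_id: str
    condition: str
    channels: dict[str, Channel] = field(default_factory=dict)

    def add(self, name: str, units: str, fs_hz: float, samples: np.ndarray) -> None:
        self.channels[name] = Channel(name, units, fs_hz, samples)


# Build a toy one-minute recording with the channel set described in the thread:
# Frank-lead ECG on three axes, respiration, two non-invasive BP signals, finger pulse.
rng = np.random.default_rng(0)
rec = Recording(participant_id="P001", condition="baseline")
for name, units, fs in [
    ("ecg_x", "mV", 1000.0),
    ("ecg_y", "mV", 1000.0),
    ("ecg_z", "mV", 1000.0),
    ("respiration", "a.u.", 100.0),
    ("bp_finger_cuff", "mmHg", 200.0),
    ("bp_radial_tonometry", "mmHg", 200.0),
    ("finger_pulse", "a.u.", 100.0),
]:
    rec.add(name, units, fs, rng.standard_normal(int(fs) * 60))

# Quick look at what a downstream analyst would receive.
print({c.name: (c.units, c.fs_hz, c.samples.shape) for c in rec.channels.values()})
```

In practice a dataset like this would probably ship in an established physiological-signal format (e.g. EDF or WFDB) rather than ad-hoc Python objects; the sketch is only meant to show the kind of per-channel metadata that makes a shared dataset genuinely ready to use.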