* BING BONG *

A new BMJ EDITORIAL from my team.

WHY RESEARCHERS SHOULD SHARE THEIR ANALYTIC CODE

Non-paywall link below.

NOW this is VERY IMPORTANT stuff for safety, efficiency, reproducibility and quality in research, data science, and evidence…

bmj.com/content/367/bm…
Our BMJ editorial is pegged to a recently retracted trial in @JAMA_current

The researchers found, to their horror, that they had made a catastrophic error in their code - their data analysis scripts - for processing and analysing the data in their RCT...

(2/n)
Because of this error in the researchers' analytic scripts, the control group and the intervention group ended up being accidentally switched in their dataset. The results of the trial were almost completely reversed. This is BAD. 3/n
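The thread does not say exactly how the two groups came to be switched, so what follows is only a hypothetical sketch of the kind of slip that a reader can spot in shared code but never in a narrative methods section. The arm coding (0 = control, 1 = intervention), the function names, and the toy data are all assumptions made up for illustration; none of it comes from the retracted trial.

```python
# Hypothetical illustration only: the coding scheme, names and data below
# are assumptions, not taken from the retracted trial's scripts.

# Assumed codebook: how the raw dataset codes the trial arms.
ARM_CODEBOOK = {0: "control", 1: "intervention"}


def label_arm_buggy(code):
    """A one-line slip: the labels are the wrong way round."""
    return "intervention" if code == 0 else "control"


def label_arm(code):
    """Look the label up in the codebook, failing loudly on unknown codes."""
    if code not in ARM_CODEBOOK:
        raise ValueError(f"unexpected arm code: {code!r}")
    return ARM_CODEBOOK[code]


def event_rate_by_arm(rows, labeller):
    """Return the proportion of participants with an event in each arm.

    `rows` is a list of (arm_code, had_event) pairs.
    """
    totals, events = {}, {}
    for arm_code, had_event in rows:
        arm = labeller(arm_code)
        totals[arm] = totals.get(arm, 0) + 1
        events[arm] = events.get(arm, 0) + int(had_event)
    return {arm: events[arm] / totals[arm] for arm in totals}


# Made-up toy data: 4 control participants (3 events),
# 4 intervention participants (1 event).
toy_rows = [(0, True), (0, True), (0, True), (0, False),
            (1, True), (1, False), (1, False), (1, False)]

correct = event_rate_by_arm(toy_rows, label_arm)
swapped = event_rate_by_arm(toy_rows, label_arm_buggy)
print("correct:", correct)
print("swapped:", swapped)
```

With the buggy labeller the two event rates simply trade places, so the apparent effect of the intervention is reversed. Both versions run without any error message, which is why a second pair of eyes on the actual script matters.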
Now, it's great that the error was eventually spotted, and fixed. But none of the commentary has alluded to the MUCH bigger issue raised by this case.

Specifically: why don't we make researchers share their analytic code, always, as a matter of routine?

4/n
It's completely bizarre that we allow people to issue waffly narrative descriptions of how they analysed their data, when they can just share the actual scripts that did all the actual work. So we can look for errors, learn, re-use, etc.

5/n
To further emphasise the value of sharing code: this catastrophic error wasn't even the only error in the scripts for the retracted RCT. (Paragraph 3 of our editorial)

6/n

bmj.com/content/367/bm…
Why doesn't this already happen? It's partly history.

Some of the arguments we have received from academics about why they SHOULDN'T be asked to share their code are very, very extraordinary.

Please read paragraph 4. Please, I beg you...

bmj.com/content/367/bm…
One group told us they couldn't share their code because the scripts are long, covering “many pages of information.”

This is very odd. There are endless free open platforms to share code. GitHub has a limit of ONE HUNDRED GIGABYTES for each repository.

bmj.com/content/367/bm…
For context, my group’s @openprescribing is a vast software project with 130,000 users a year. It's over 30,000 lines of code, at least one order of magnitude bigger than any single epidemiological analysis script, but this equates to... only 1.5 MB of storage.
The reluctance of epidemiologists and electronic health record researchers to share their code is holding back research: it's inefficient, outdated, and it breeds error.
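A rough sketch of that arithmetic in Python, for anyone who wants to check the size of their own scripts. The helper `measure_code` and its default list of file suffixes are assumptions for illustration. The figures plugged in at the bottom are the ones quoted in this thread (30,000 lines, 1.5 MB, 100 GB), taken at face value and using decimal units.

```python
from pathlib import Path


def measure_code(root, suffixes=(".py", ".sql")):
    """Count lines and bytes across source files under `root`.

    The suffix list is an assumption; adjust it for your own project.
    """
    total_lines = 0
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            data = path.read_bytes()
            total_bytes += len(data)
            total_lines += len(data.splitlines())
    return total_lines, total_bytes


# The thread's own figures: about 30,000 lines in about 1.5 MB.
lines_quoted = 30_000
bytes_quoted = 1.5 * 1_000_000  # taking 1 MB as 10**6 bytes
bytes_per_line = bytes_quoted / lines_quoted
print(f"average bytes per line: {bytes_per_line:.0f}")

# How many projects of that size would fit in 100 GB (10**11 bytes)?
projects_in_100gb = (100 * 10**9) / bytes_quoted
print(f"projects of that size in 100 GB: {projects_in_100gb:,.0f}")
```

Calling `measure_code(".")` in a project folder returns a `(lines, bytes)` pair for the files it finds. On the quoted figures, the average works out at about 50 bytes per line, so "many pages" of analysis script is a tiny amount of storage by any measure.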

Let me tell you what we are doing in my small group to try to fix this problem...

bmj.com/content/367/bm…
Firstly, at @EBMDataLab we try to lead by example.

On GitHub, under open licenses, we have shared 44,000 lines of code in 34 public repositories: over 5,000 commits, 850 Python files, 105 SQL files, 4,600 lines of SQL, 140 Jupyter notebooks.

Come, see!

github.com/ebmdatalab
Our “issue trackers” are open to all, with 260 open issues and 745 closed ones, all containing a permanent public record of our technical discussions around barriers and solutions, all discoverable in Google. Our code has been widely reviewed and re-used in the community. ...
Now. Just doing the right thing is not enough. You must ADVOCATE to get others to change. And you must TEACH and SHARE to help others who want to change.

On advocacy, we have been sharing our open methods, informally, through blogs...

ebmdatalab.net/openness-and-t…
We will shortly launch some courses on open analytics, and we aim to share lots of teaching materials online, to help researchers do the right thing. I am sad that £40 million spent on the Farr Institutes, National Health Informatics Institutes, produced approximately nothing on this.
So in the coming months we will launch a strategic, coordinated push for open analytics in healthcare.

This editorial is the beginning.

Too many epidemiologists are trapped in the past. Patients and taxpayers are suffering. Funders must wake up and act.

bmj.com/content/367/bm…
If you would like to join us on this Open Analytics initiative then get in touch. As ever, my email is in my Twitter profile. We always want allies!

That is the end of the nerd rage for today.

Here is a funny video of animals slipping around on the ice.

I agree training is good. We are happy to help deliver / develop / share this teaching if @HDR_UK @UKRI and those with funds to do work at scale will support it. However, as we explain in the editorial, we are also keen on people sharing "good enough" code.
