Discover and read the best of Twitter Threads about #reproducibility

Most recent (8)

Today I'd like to talk about issues with respect to #openscience and specifically sharing code.

But first an intro to what open science is...
I am sure you all know a bit about #openscience already but essentially it's a very broad community/movement that aims to make science more accessible, transparent, inclusive, etc.

@daniellecrobins and @rchampieux have this great umbrella infographic!

#Openscience boils down to making science more free and open, in the same general way that the related open source and free software movements/communities pushed for change: by making the outputs of science more accessible to both the general public and other scientists.
Read 58 tweets
Hi modeling tweeps, People have been asking for suggested readings to learn about #modeling in #psychscience. I am creating this thread to collect suggestions, with the explicit purpose to demonstrate the *diversity* of modeling tools and approaches. Please feel invited to
I'll start by adding some suggestions myself. Here is a starting point for the discussion of #reproducibility in modeling, given by @o_guest, with what I think are the relevant tags: #cognitivemodeling #theoreticalmodeling Anything to add or correct @o_guest ?
Read 15 tweets
Today at my first #SciData18 conference with @SpringerNature. Today's themes are:
mentoring open science
+
making data findable, accessible, interoperable and reusable throughout the research lifecycle
Data Generalist @becky_boyles

Scientists must store, integrate, analyse, compare + share data sets. Via @TheEconomist, data is the new oil

Or is it the new plastic?

Careful how data is used as a resource. Closed v shared v open data. Not even 'open data' is truly open #SciData18
Data Generalist @becky_boyles

New model for data -
not sharing data via 'copying' (email, dropbox)
enhanced security where user is both producer and consumer
teams form outside silos
democratic tools for use by non-programmers
integrated data

#SciData18
Read 41 tweets
If you want to make code/data “available”, GitHub isn’t enough.

You must deposit at a DOI-issuing data repository. @figshare & @ZENODO_org are both free & awesome, and can be synced w/ a GitHub repo

Why is GitHub not enough? 1/4
#OpenAccess #OpenData
GitHub is a place for things to be worked on, not for them to live forever.

- Links are fragile (username, repo name)
- Users can delete repos
- GitHub could make your code/data unavailable in the future.

DOI-issuing data repositories preserve your stuff for the future 2/4
Depositing on @KaggleDatasets isn’t good enough for #OpenAccess #OpenData either.

- No API for accessing files without an account
- Fragile URLs
- Kaggle Datasets is a commercial thing.

Do all three! GitHub repo, Kaggle Dataset and @figshare or @ZENODO_ORG 3/4
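To make the contrast with fragile repo links concrete, here is a minimal sketch of turning a DOI into a persistent link through the doi.org resolver. The helper name `doi_to_url` and the example DOI are hypothetical placeholders, not a real deposit.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Turn a DOI string into a resolver link of the form https://doi.org/<doi>.

    Hypothetical helper for illustration. It only normalizes the string;
    it does not check that the DOI is actually registered.
    """
    doi = doi.strip()
    # Accept common ways of writing a DOI and reduce them to the bare identifier.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError("not a DOI: " + repr(doi))
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(doi, safe="/")


if __name__ == "__main__":
    # Placeholder DOI, invented for this example.
    print(doi_to_url("doi:10.5281/zenodo.1234567"))
```

The link depends only on the identifier, so it stays the same if the username or repo name behind the deposit changes.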
Read 4 tweets
Five hours in @Reagan_Airport and still here; twice rebooked due to thunderstorms—hope I make it to Boston tonight for tomorrow's IEEE #reproducibility workshop.
As the @IEEEorg steps into the #reproducibility discussion, I'm really hoping they'll pay attention to terminology—"Terminologies for Reproducible Research" arxiv.org/abs/1802.03311
My assessment after reviewing literature from more than a dozen fields is that the predominant usage for #reproducibility is “same data+same methods=same results.”
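A minimal sketch of what "same data+same methods=same results" can look like in code, assuming the only source of randomness is a seeded generator. The function `run_analysis` and the data values are made up for illustration.

```python
import hashlib
import json
import random


def run_analysis(data, seed):
    """Hypothetical analysis: mean plus a rough bootstrap interval.

    All randomness comes from a generator seeded with `seed`, so the
    same data and the same seed always give the same output.
    """
    rng = random.Random(seed)
    n = len(data)
    boot_means = []
    for _ in range(200):
        # Resample the data with replacement and record the mean.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        boot_means.append(sum(sample) / n)
    boot_means.sort()
    # Approximate 2.5th and 97.5th percentiles of 200 bootstrap means.
    return {"mean": sum(data) / n, "ci_low": boot_means[4], "ci_high": boot_means[194]}


def fingerprint(result):
    """Hash a result dict so two runs can be compared with one string."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9]  # made-up numbers
    first = fingerprint(run_analysis(data, seed=42))
    second = fingerprint(run_analysis(data, seed=42))
    print("reproduced:", first == second)
```

Publishing the fingerprint alongside the data and code gives others a quick way to check whether a rerun matches.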
Read 10 tweets
Fifth and final session on #ResearchIntegrity Brandon Stell on #PubPeer pubpeer.com @FEBSnews
Stell: Scientists are not the only people whose work relies on accuracy of published work - also basis for current and future research, public policy, etc #ResearchMisconduct
Stell: cites the #Poldermans case and how a flawed publication that made its way into guidelines led to 8000 deaths
Read 13 tweets
Third #ResearchMisconduct presentation by Bernhard Rupp: The action is in the re(tr)action @FEBSnews #FEBS2018
Rupp breaks with convention and walks away from the podium #WanderingSpeaker
Rupp: valid concerns exist about incorrect and irreproducible research, but is there a "reproducibility crisis"? #ResearchMisconduct
Read 13 tweets
How many random seeds are needed to compare #DeepRL algorithms?

Our new tutorial to address this key issue of #reproducibility in #reinforcementlearning

PDF: arxiv.org/pdf/1806.08295…

Code: github.com/flowersteam/rl…

Blog: openlab-flowers.inria.fr/t/how-many-ran…

#machinelearning #neuralnetworks
Algo1 and Algo2 are two famous #DeepRL algorithms, here tested
on the Half-Cheetah #opengym benchmark.

Many papers in the literature compare using 4-5 random seeds,
like on this graph which suggests that Algo1 is best.

Is this really the case?
However, more robust statistical tests show there are no differences.

For a very good reason: Algo1 and Algo2 are both the same @openAI baseline
implementation of DDPG, same parameters!

This is what is called a "Type I error" in statistics.
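A small simulation can illustrate the point. The sketch below is not the tutorial's code: it assumes each seed's final return is a Gaussian draw with made-up mean and spread, and it uses an exact permutation test as a stand-in for whichever tests the tutorial recommends. Both "algorithms" draw from the same distribution, so every detected difference is a Type I error.

```python
import itertools
import random
import statistics


def exact_permutation_pvalue(a, b):
    """Two-sided exact permutation test on the difference of means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    total = sum(pooled)
    at_least_as_extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling the pooled runs into two groups.
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so float rounding does not drop exact ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


def simulate_false_positive_rate(n_seeds=5, n_experiments=400, alpha=0.05, rng_seed=0):
    """Fraction of experiments in which two identical algorithms look different."""
    rng = random.Random(rng_seed)
    false_positives = 0
    for _ in range(n_experiments):
        # Same distribution for both: assumed mean 1000, std 300 (invented values).
        algo1 = [rng.gauss(1000.0, 300.0) for _ in range(n_seeds)]
        algo2 = [rng.gauss(1000.0, 300.0) for _ in range(n_seeds)]
        if exact_permutation_pvalue(algo1, algo2) < alpha:
            false_positives += 1
    return false_positives / n_experiments


if __name__ == "__main__":
    print("false positive rate:", simulate_false_positive_rate())
```

With a valid test, the false positive rate should stay near or below `alpha`. Judging from a plot of 4-5 seeds alone offers no such guarantee.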
Read 11 tweets
