Catherine Wheller @catinthefield
Today at my first #SciData18 conference with @SpringerNature. Today's themes are:
mentoring open science
making data findable, accessible, interoperable and reusable throughout the research lifecycle
Data Generalist @becky_boyles

Scientists must store, integrate, analyse, compare + share data sets. Via @TheEconomist, data is the new oil

Or is it the new plastic?

Be careful how data is used as a resource. Closed vs shared vs open data. Not even 'open data' is truly open #SciData18
Data Generalist @becky_boyles

New model for data -
not sharing data via 'copying' (email, dropbox)
enhanced security where user is both producer and consumer
teams form outside silos
democratic tools for use by non-programmers
integrated data

Data Generalist @becky_boyles

Link to NIH Data Commons Pilot Phase Consortium

Making data Findable, Accessible, Interoperable, and Reusable (FAIR)

Data Steward Coordinator @martateperek @tudelft

Institution appoints #datastewards at all faculties. They understand researchers and their problems. They have a PhD - so understand the pain points. There to help and improve data culture.

Aside - I didn't understand data structure at my research institution. Training was given in large groups, not tailored, and I didn't reach out further. Probably missed opportunities. Would have been great to have proactive, individual data mentorship.

#SciData18 @martateperek
Great initiatives at institutions:

Cambridge Data Champions…

TU Delft Data Stewards… (they're hiring!…)

@ResPlat at @unimelb do a great job too!

#SciData18 @martateperek
Claudia Wolff | Coastal Risks and Sea-Level Rise @WolffClaudi

Problem: missing data!

Old approach - manually digitising coastline through @googleearth - tedious!

New approach - citizen science!

Takeya Adachi | Japan. Agency for Med Res. + Dev.

Problem: ultra-rare disease & undiag. diseases overlooked

1. survey undiag. patients in Japan
2. local network
3. global data exchange (clinical + genetic)
4. patient + public involvement…

Andrew Tatem | WorldPop @WorldPopProject

Problem: Bad maps

Approach: Disaggregated blocky maps to aggregated, sharable, interactive usable grids

Not just static anymore - can use changing data for policy - training!

Natalia Tejedor Garavito | WorldPop @NatuTejedor @WorldPopProject

Problem: Need to create maps that can be compared across countries

Approach: Collecting data, using data + combining datasets
Code share:

Sophie Adler | MELD Project @sophieadler

Problem: finding evidence of epilepsy characteristics in MRI

Approach: deep learning - train lesion classifier

open access journal
data repository

resulted in external replication and validation!

Jane Seymour | Nursing + Midwifery @sheffielduni

Problem: Gaining consent to use end-of-life care data

Data sharing + informed consent: do they really understand?
Layered consent model - cooperation btw ethics, family etc
Data is being used!

James Avery | Hyper acute stroke unit

Problem: sharing + accessing health data

Why share data? Find the solution to your exact data online! Don't say 'contact author for data' just put it up!

Sarala Wimalaratne @EMBLEBI

Problem: Inconsistent identifiers for life sciences - too many urls!

Approach: Actionable compact identifiers!

prefix:identifier (e.g. taxon:9606)

location independent
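
A minimal sketch of how a compact identifier such as taxon:9606 could be split and turned into a resolvable link. The function names, the lowercasing of the prefix, and the use of identifiers.org as the resolver base are assumptions for illustration, not something shown in the talk.

```python
from typing import NamedTuple

# Assumed resolver base URL; swap in whichever resolver you actually use.
RESOLVER = "https://identifiers.org/"


class CompactId(NamedTuple):
    prefix: str      # namespace, e.g. "taxon"
    accession: str   # local identifier within that namespace, e.g. "9606"


def parse_compact_id(text: str) -> CompactId:
    """Split 'prefix:identifier' on the first colon only."""
    prefix, sep, accession = text.strip().partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {text!r}")
    # Lowercasing the prefix is an assumption made for this sketch.
    return CompactId(prefix.lower(), accession)


def resolver_url(cid: CompactId, base: str = RESOLVER) -> str:
    """Build a location-independent link: the resolver decides where it leads."""
    return f"{base}{cid.prefix}:{cid.accession}"


cid = parse_compact_id("taxon:9606")
print(resolver_url(cid))
```

Because the link only carries the prefix and the identifier, the underlying database can move without breaking citations; only the resolver's registry needs updating.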

Nick DeVito @NDevito1 @EBMDataLab

Problem: trial transparency - reporting info completely, accurately + timely

Approach: website that tracks breaches of trial reporting laws

'Hey, this trial report is overdue!'
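
A rough sketch of the kind of check such a tracker might run. The 365-day reporting window, the field names, and the function itself are hypothetical; the actual rules depend on the law being tracked and were not spelled out in the talk.

```python
from datetime import date, timedelta
from typing import Optional


def is_overdue(
    completion: date,
    results_posted: Optional[date],
    today: date,
    window_days: int = 365,  # hypothetical reporting window
) -> bool:
    """Flag a trial whose results are still missing after the window closes."""
    if results_posted is not None:
        return False  # results exist; this sketch ignores late reporting
    deadline = completion + timedelta(days=window_days)
    return today > deadline


if is_overdue(date(2017, 1, 1), None, today=date(2018, 6, 1)):
    print("Hey, this trial report is overdue!")
```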

Carsten Kettner | STRENDA-DB @CarKettner @BeilsteinInst

Problem: missing + imprecise information

Approach: define guidelines; check data for compliance with guidelines

Completeness; Compliance; Registration; Quality -> better data!

Andrej-Nikolai Spiess

Problem: How to reproduce stat significance w/o raw data?

Approach: digitise graphs; re-analyse data; look for 'reversers'
What proportion of data points need to be removed to change significance?
Concerning amount...
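
A simplified sketch of the 'reversers' idea, under stated assumptions: it tests a Pearson correlation, uses the Fisher z approximation for the p-value, and removes only one point at a time. It is an illustration, not the speaker's actual method.

```python
import math
from statistics import NormalDist


def pearson_p_value(x, y):
    """Two-sided p-value for a Pearson correlation (Fisher z approximation)."""
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 points for the Fisher z approximation")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 1.0  # no variation, so no evidence of correlation
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) >= 1.0:
        return 0.0  # perfect correlation; atanh would be infinite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def find_reversers(x, y, alpha=0.05):
    """Indices of points whose removal flips significance at level alpha."""
    baseline = pearson_p_value(x, y) < alpha
    reversers = []
    for i in range(len(x)):
        xs, ys = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        if (pearson_p_value(xs, ys) < alpha) != baseline:
            reversers.append(i)
    return reversers


# Toy data digitised "by hand": one extreme point drives the correlation.
x = [1, 2, 3, 4, 5, 6, 20]
y = [2, 1, 2, 1, 2, 1, 20]
flagged = find_reversers(x, y)
print(f"{len(flagged)} of {len(x)} points reverse the result: {flagged}")
```

In this toy data set only the outlier (index 6) is flagged: with it the correlation looks significant, without it the result disappears.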

Aliaksandr Yakutovich | @EPFL_en

Problem: Screen large number of materials, reproducibility of computational research; track provenance


Helena Cousijn | @datacite @HelenaCousijn

Problem: Developing data metrics; standardise data reuse

Approach: infrastructure to count data reuse (citations, views, downloads); display data metrics


Alasdair Rae @undertheraedar @sheffielduni


Anyone can find + read your research! It's nice + good + the right thing to do!
You don't know who might be interested!

See AR's projects here:

Great way to finish the lightning talks:

By sharing your data you open yourself up to the good and the bad - but open data is a risk worth taking

The Wellcome Data Re-use Prizes! @apDinsmore…

Data shared by researchers can be re-used by others to generate new insights and tools

Competition data:
1. AMR Surveillance data! Synapse:syn1…
2. Malaria! Synapse:syn1…

Magdalena Skipper | Editor in Chief of @nature @Magda_Skipper

#stateofopendata survey - reviewing the development of the open data movement over the past 10 years

Credit, concerns, awareness, reuse, guidelines, motivations


Magdalena Skipper | Editor in Chief of @nature @Magda_Skipper

Only 10% of authors cite datasets used in research papers properly

Nature Research journals mandate a data availability statement + recommend public repositories - very supportive of open data + training

Magdalena Skipper | Editor in Chief of @nature @Magda_Skipper

Data sharing policies e.g.…

Research data support…
1. Can you share your data?
2. Are your data ready to share?
3. Who is the owner of your data?

Magdalena Skipper | Editor in Chief of @nature @Magda_Skipper

Not just biomed data! Here's a resource for the Earth scientists (my people):… - develop standards in the Earth sciences to enable FAIR data on a large scale

Magdalena Skipper | Editor in Chief of @nature @Magda_Skipper

@CodeOceanHQ - allow reviewers access to the exact same environment that the authors used in the paper. A great leap into code reproducibility! (wow!)

John Burn-Murdoch | Data Viz @FinancialTimes @jburnmurdoch

Deep desire to make arguments - communication first, not visualisation first.

Focus on the story!

Effective Data Vis: Find the meaningful between the beautiful and clinical.

John Burn-Murdoch | Data Viz @FinancialTimes @jburnmurdoch

“You have to be like the worst tabloid newspaper in the front and the Academy of Science in the back.” - Hans Rosling

Examples of Hans Rosling data vis:…

John Burn-Murdoch | Data Viz @FinancialTimes @jburnmurdoch

Data vis paper!… [pdf]

Effectiveness of data vis:

1. Do you recognise this graphic?
2. Image blurred - what do you remember the graphic telling you?

Text is a key part of graphics!

John Burn-Murdoch | Data Viz @FinancialTimes @jburnmurdoch

Active titles - write take home message in the title

Say it again! Use data and text to explain message

Exploit colour (remember colour blind design…)

Labels - emphasis on narrative

John Burn-Murdoch | Data Viz @FinancialTimes @jburnmurdoch

Aside - I'm keen on communicating meaningful stories in data + this is a field I'd like to explore. If anyone would like 2 chat to a beginner about tools + skills 2 start with, pls let me know!

Panel discussion: Responsibility of Reproducibility

Kirstie Whitaker @kirstie_j

Sue Fletcher-Watson @SueReviews

Paola Quattroni @PaolaQuattroni

Natalia Tejedor-Garavito @NatuTejedor

Zaheer-Ud-Din Babar

Computer battery dying, let's see how we go!

How do we encourage sharing of metadata?

Integrate data cleaning from the start!

If you can't share sensitive data, share metadata!

Incentives for ppl who do it correctly - though recognise huge time commitment. Diversify meaning of success in academia!

How do we encourage sharing of data that didn't support hypothesis?

Funders + institutes provide incentives for academics
Supervisors to let go of the ego + have trust in why result not supported
Commit to publish every protocol

How can we encourage reproducibility studies? Who funds them? Are we in a #reproducibilitycrisis?

Funders - harvest grant scheme for reproducing studies
Give students reproducibility studies rather than new projects -
Less emphasis on novelty, more on quality!

Who should be responsible for long-term archiving research data?

Institutional level + Government level (perhaps the same as institution, as unis funded by government)

What studies should be reproduced?

Not everything needs to be reproduced - some studies provoke a line of enquiry. Lots of qualitative studies fall here.

(Definitions: Reproducibility
1. original data + validating study
2. new, similar data + get same results)

Your study shouldn't be too good to be above checking - @kirstie_j

Debate! Yes, but that doesn't mean every study *should have* to be reproduced.

Who is responsible for reproducing work?

You should be able to have a career checking someone else's work - @kirstie_j

And Fin.

5% battery left.

Brilliant interactive conference! Diversity in topics, gender, geographic background, career level, data origin, opinions, universities...and more I'm sure.

Well done all involved!
