Stephanie Hyland @_hylandSL
panel at the workshop on critiquing and correcting trends in ML research is starting in room 511 #NeurIPS2018. Panelists: Rich Caruana, @hannawallach, Finale Doshi-Velez, @suchisaria, @RandomlyWalking and @zacharylipton
Q: what are the top troubling trends for the near future?
A: groupthink as we all read the same papers, large companies with massive PR machines, the belief that good test set performance means you can just go ahead and deploy models in the wild #NeurIPS2018
Doshi-Velez: the issue is not so much about scholarship, publication issues and so on. It's how we interface with the real world. We're not trained to do this, we're not a professional profession.
Caruana: mistakes in real-world ML will hurt people. #NeurIPS2018
Q: what about "hype"?
A: Wallach & Sutton: press releases before *and* after publication are problematic: they shape how the broader community views our research.
Caruana: healthcare, a traditionally conservative field, is starting to believe DL hype! Concerning. #NeurIPS2018
Saria: we as a community hold ourselves to a very low bar as to what it means to "solve a problem". In other fields, it's a *big* effort, but in ML we seem to have a culture of flag planting (Doshi-Velez: "licking the icing on the cupcake so nobody else can have it") #NeurIPS2018
Q: how do we deal with double blind review and arXiv?
A: originally, #NeurIPS2018 was *not* blinded. Now we have a weird situation where it's *sort* of blinded. Wallach doesn't advocate getting rid of double blinding: even if imperfect, seeing "anonymous authors" is powerful
Sutton on preprints as prior work: think about what the researchers would have known about when they *started* the work #NeurIPS2018
Q: sharing review information (e.g. ICLR's open review system)
A: good if it helps the reviewing load, bad if old reviews bias future reviewers, e.g. if the paper has been updated substantially but reviewers read the old reviews... #NeurIPS2018
Saria: the discussion in the review process is also a scholarly endeavour. How do we highlight these valuable discussions and high quality reviews? #NeurIPS2018
Q: how about paying reviewers?
Saria: the issue for reviewers is not *money* so much as *time*
Sutton: money makes it easier to hold people to account for their reviewing
Wallach: the issue is we don't have *enough* reviewers for the number of submissions #NeurIPS2018
Doshi-Velez: grad students are common reviewers these days, even though they previously would not have been considered qualified. But they tend to pick apart papers very carefully, while more senior reviewers can see context/novelty better. #NeurIPS2018
Q: should we make code publication mandatory?
Wallach: if we go this direction (for "reproducibility"), do we also mandate data sharing? This will exclude many people in fields such as medicine (me: 🙋), social science, etc. #NeurIPS2018
Suggestion from Saria: some of these questions need longer term discussion and decision-making, beyond the annual format of the #NeurIPS2018 board.
Wallach: next year the board will meet every 2 weeks (!), but these decisions also need larger involvement
Doshi-Velez: smaller venues have more freedom to experiment with different models (about review, code requirements, etc). The results could then percolate up to #NeurIPS2018
Going back to code release, Caruana points out that there are papers where the *computational* requirements make it nigh-impossible to reproduce.

Q: should such papers be rejected?
Caruana: papers are rejected for using private data, how different is that? #NeurIPS2018
Sutton: papers can also contain *ideas* which have value beyond the specific results, so even if the work is technically non-reproducible it may still be worth publishing #NeurIPS2018
Questions from the audience: what are the *downsides* to requiring code release? #NeurIPS2018
Comment/question from the audience: there are two rounds of review.
First, "is this work appropriate for a conference?"
Second, "does the community think this work is interesting/worth citing?".
Shared code aids in the second round.
#NeurIPS2018
Final question: action. What do we do?
Doshi-Velez: breakout sessions at a workshop on e.g. what does a good paper look like?
Sutton: organise! Find people who care about these issues, write whitepapers, make recommendations
Tom: get in touch with workshop organisers
#NeurIPS2018