TLDR: In its current form, peer review is a poorly defined task with apples-to-oranges comparisons and unrealistic expectations. /1
Reviewers resort to heuristics such as reject-if-not-SOTA to cope with uncertainty, so the only way to change that is to reduce the uncertainty. Which is at least partly doable: better paper-reviewer matching, unambiguous eval criteria, fine-grained tracks, better review forms, etc. /2
Which criteria and forms, exactly? Each field has to find out for itself, through iterative development and experiments. Except that in NLP such work would be hard to publish, so there are no incentives to do it - and no mechanisms to test and compare any solutions. /3
We mostly rant about peer review *after* acceptance notifications come out, but what do we expect to change without systematic work on improving its quality *between* conferences? /4
This is not to say that conference organizers are doing a bad job. But each conference makes a unique set of choices, and we have no way to systematically compare them and decide which policies should be kept. Also, this would be a lot of extra work. /5
Actionable steps:
- talk about peer review a lot more, build up the prestige of the topic and incentives to work on it
- create new ACL roles to think about systematic testing/implementation of peer review policies, and feedback mechanisms between organizers, authors and reviewers /6
TLDR for those who missed the prior discussion: non-anonymous preprints systematically disadvantage unknown labs and/or underrepresented communities.
My previous post: hackingsemantics.xyz/2020/anonymity/ /1
To summarize both posts, we have the following trade-off for unknown/underrepresented authors:
* anonymous preprints: better acceptance chance;
* arXiv: lower acceptance chance, but more opportunities to promote unpublished work and get invited for talks and interviews. /3
* if you missed the @coling2020 deadline 😉
* if you have any questions: we linked to announcement threads!
* to find folks to follow in your field: we tried to tag all the organizers!
I really enjoyed this episode of #nlphighlights with @earnmyturns. It is about managing industry research teams, but also generally about incentives in research and the need for intellectual diversity.
If hiring decisions are guided by the number of ACL/NeurIPS papers, you will hire essentially the same person over and over again: probably CS background, from a top US school, white, male, with the means to ignore everything for the sake of *ACL deadlines for a few years. /2
With more of the same kind of people, you will be making incremental improvements to the same thing you're already doing - instead of trying to do something radically better. That would require intellectual diversity, so hiring managers should be casting their net wider. /3