Matt Gardner
Researcher at Scaled Cognition. Formerly at Semantic Machines, @allenai (@ai2_allennlp, #nlphighlights).
Jun 12, 2021 22 tweets 4 min read
Here's some more context on my arguments about the ethics of crowdsourcing, as I don't think we are all operating under the same set of facts.

I have read the papers cited in this section on crowdsourcing, and I'm very skeptical of the numbers presented there. First, I would venture to say that I am probably one of the people who has used crowdsourcing the most in NLP recently: I have created some ten relatively large-scale datasets on Mechanical Turk in the last few years, and my experience does not match those reported numbers at all.
May 23, 2020 7 tweets 2 min read
This shows, once again, the problem of conflating a format with a phenomenon (not talking about Graham specifically here, but the field as a whole). Taking two sentences and classifying them is a format that permits arbitrary scope. That this format got conflated with the semantic notion of entailment as a whole is a (collective) mistake that has caused a *ton* of confusion about what the capabilities of any particular trained model should be.
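To make the point about scope concrete, here are a few toy premise/hypothesis pairs (invented for illustration, not drawn from any dataset) showing how the same two-sentence classification format can demand anything from synonym knowledge to arithmetic to temporal reasoning:

```python
# Invented premise/hypothesis pairs, purely for illustration.
pairs = [
    # synonymy / lexical knowledge
    ("The cat slept on the couch.",
     "The cat napped on the sofa.", "entailment"),
    # numerical reasoning
    ("Three of the five committee members voted no.",
     "A majority of the committee voted no.", "entailment"),
    # temporal reasoning
    ("The meeting ended at noon.",
     "The meeting was still going at 1pm.", "contradiction"),
]

for premise, hypothesis, label in pairs:
    print(f"{label:>13}: {premise}  ->  {hypothesis}")
```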
Apr 7, 2020 18 tweets 9 min read
Evaluating NLP Models via Contrast Sets

New work that is a collaboration between 26 people at 10 institutions (!)

arxiv.org/abs/2004.02709

Trying to tag everyone at the top of the thread, here it goes: @yoavartzi, Victoria Basmova, @JonathanBerant, @ben_bogin, @soshsihao, @pdasigi, @ddua17, @yanaiela, Ananth Gottumukkala, @nitish_gup, @HannaHajishirzi, @gabriel_ilharco
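For anyone who wants the one-slide version of the evaluation idea before reading the paper, here is a rough sketch (placeholder names, not code from the paper or any of our repos): group each original test example with its minimally perturbed variants, and only count a model as right on that group if it gets every example in it right.

```python
def contrast_consistency(predict, contrast_sets):
    """Fraction of contrast sets the model gets entirely right.

    `predict` and `contrast_sets` are placeholders: each element of
    `contrast_sets` is a list of (input, gold_label) pairs, an original
    test example together with its minimally perturbed variants, and a
    set only counts if the model is correct on every one of them.
    """
    consistent = sum(
        all(predict(x) == gold for x, gold in contrast_set)
        for contrast_set in contrast_sets
    )
    return consistent / len(contrast_sets)
```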
Mar 3, 2020 12 tweets 3 min read
This was an interesting paper to read. It's well-written, and the method clearly works very well. A few things struck me as I read.

First, how simple the method is: they use an inner-product search to retrieve encoded documents, then pass the retrieved document to some end task, doing a very shallow approximation of marginalizing over the retrieval. That's it.
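In rough numpy terms, the pattern I'm describing looks something like the sketch below. This is a toy rendering of my summary, not the paper's actual code; `answer_score`, `docs`, and the encoders are all stand-ins.

```python
import numpy as np

def retrieve_and_marginalize(query_vec, doc_vecs, docs, answer_score, k=5):
    """Toy retrieve-then-marginalize loop with stand-in components.

    query_vec: (d,) encoded query; doc_vecs: (N, d) encoded documents;
    answer_score(query_vec, doc) scores a candidate answer given one
    retrieved document. All of these are placeholders for learned pieces.
    """
    # Inner-product retrieval: score every document against the query.
    scores = doc_vecs @ query_vec      # shape (N,)
    topk = np.argsort(-scores)[:k]     # indices of the k highest-scoring documents

    # Softmax over the retrieval scores of just the top-k documents, then a
    # weighted sum of the end-task scores: a shallow approximation of
    # marginalizing over which document was retrieved.
    weights = np.exp(scores[topk] - scores[topk].max())
    weights /= weights.sum()

    return sum(w * answer_score(query_vec, docs[i]) for w, i in zip(weights, topk))
```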
May 14, 2019 7 tweets 1 min read
Thus begins my tri-annual brooding on why we have gatekeepers in between our papers and their intended audience. And I say this as someone whose papers mostly made it past the gatekeepers.

I really don't see the point. The long-term impact of your work doesn't depend on which stamps of approval it got from the gatekeepers; it depends on how useful the community as a whole finds your contributions to be.
Mar 4, 2019 14 tweets 5 min read
Announcing DROP, a new reading comprehension benchmark that requires discrete reasoning over paragraphs of text. New @NAACLHLT paper by @ddua17, @yizhongwyz, @pdasigi, @GabiStanovsky, @sameer_, and me.

allennlp.org/drop.html
arxiv.org/abs/1903.00161

I am super excited about this; I've been thinking about this for over a year, and we finally decided to pursue it as our first collaboration between AI2 Irvine and the UCI NLP group. This is a hard dataset that uses complex questions to test comprehensive understanding.
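To give a flavor of what "discrete reasoning" means here, a small invented illustration in the style of the dataset (not an actual DROP question): answering typically requires extracting numbers or spans from the passage and composing them with operations like subtraction, counting, or sorting.

```python
# Invented example in the spirit of DROP, not an actual dataset item.
passage = ("The Broncos scored touchdowns of 7 and 12 yards in the first half, "
           "then added a 33-yard field goal in the third quarter.")
question = "How many yards longer was the second touchdown than the first?"

# A model has to locate the two touchdown distances in the passage and then
# do the arithmetic; that composition step is the "discrete reasoning" part.
first, second = 7, 12
answer = second - first  # 5
print(answer)
```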