Yoav: Personally, I don't work on huge data. If some company trained a huge LM on the entire web, that would be great to have and analyze.
Ryan: Creating datasets and data annotation is not sexy. Community does not value it.
Audience: Web is much smaller in non-English world. Data is enclosed in social networks, not easily accessible.
Audience: Models that people are developing don’t scale very well to web-scale data.
Meg: Representations do well if the future is similar to / has the same distribution as the past. They do not generalize well to different future events.
Audience: Should we focus more on stateful or stateless representations?
Yoav: Should focus on both.
Audience: Representations don't really outperform a bag of sentences; they can't really model a narrative or underlying structure.
Yejin: There is a lot of focus on sentence-level tasks. Some challenging tasks go beyond the sentence level, e.g. NarrativeQA.
Yoav: SQuAD is the MNIST of NLP.
Graham: There is nice work on incorporating discourse structure and coreference structure. Don't be afraid to do something slightly more complicated than an LSTM.
Yejin: We found discourse-level structure to be useful; it slightly outperforms LSTMs. The structure, however, mainly captured cross-sentence entity relations.
Meg: We're not working with neurotransmitters. There are definitely things we can learn from humans, but we should not try to exactly replicate the human process of language learning.
Yejin: Groundedness matters a lot, but perhaps more important is the ability to abstract, learn about the world, and observe it. Children build a model of the world, ask questions about it, and refine the model.
Yejin: We do not need to confine ourselves to psycholinguistics but can look for broad inspiration.
Audience: Should some of our systems focus on more constrained worlds, e.g. BlocksWorld so that we can focus on certain aspects, e.g. reasoning?
Meg: Isolating particular kinds of instances is useful. Having very clear dependent and independent variables is useful.
Audience: Allow scope for experimentation with simple, formulaic language, not just open-domain language.
Yoav: They're suboptimal. They're simple and silly. In 10 years, we'll have different models.
Yejin: Same problem in computer vision. Convnets work very well.
Audience: We're doing really well on current metrics, and it's been a while since the last winter/bust. People tend to forget that there are booms and busts, which gives rise to unrealistic expectations.
Graham: A lot of papers examine this. Hype is mostly generated by corporate press releases.
Yoav: Not the right crowd to complain to about hype.
Audience: Will just building better empirical models, without theory, enable language understanding?
Ndapa: People also appreciate empirical work.
Yejin (being controversial): Academia will survive even in an empirical world. Creative work will always be there, and being creative does not require a lot of GPUs.
Graphical models had a lot of theory.
Audience: A recent paper criticizes ML scholarship.
Yoav: Empirical work can be good or bad. Not many interesting papers in NLP make use of massive amounts of compute.
Fin.