Stephen Mayhew @ EMNLP2019

@mayhewsw, 19 tweets, 11 min read

Chris Manning keynote: multi-step reasoning for answering complex questions. #conll2019 #emnlp2019

Manning: Lehnert (1977) says that NLU can be measured by asking questions. #conll2019 #emnlp2019

Manning: SQuAD leaderboard performance (read: BERT) is impressive, but it's hard to say these models exhibit true understanding. #conll2019 #emnlp2019

Manning: thinking and reasoning are missing ingredients. #conll2019 #emnlp2019

Manning: progress in deep learning should involve more traditional AI goals, such as reasoning. #conll2019 #emnlp2019

Manning: GQA dataset: natural language questions generated from scene graphs taken from the Visual Genome project. Large gap from human performance. cs.stanford.edu/people/dorarad… #conll2019 #emnlp2019

(computer crashed here... missed some sections, sorry)

Manning: instead of dividing VQA into V or QA, let's model the "language of thought", or concepts. Method is attention over *concepts* via neural state machines (NSM) (NeurIPS 2019 paper). #conll2019 #emnlp2019

Manning: concepts are represented by disentangled attributes/properties (reminds me of semantic decomposition work from @jhuclsp). Entities are nodes in the NSM, properties are edges. #conll2019 #emnlp2019

Manning: convert question into a sequence of instructions, induce an NSM from image, then apply these instructions to the NSM. #conll2019 #emnlp2019
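To make the "instructions applied to a state machine" idea concrete, here is a toy sketch. It is not the NSM from the paper: the real model learns soft attention over embedded concepts, while this version uses a hand-written scene graph, exact string matching, and made-up instruction types (`attr`, `rel`). All names below are hypothetical.

```python
# Toy sketch only: a hand-built scene graph and hard matching stand in for
# the learned, soft components of a neural state machine.

# Hypothetical scene graph: nodes carry attribute sets, edges carry a relation label.
NODES = {
    "apple": {"red", "fruit"},
    "table": {"wooden", "furniture"},
    "cup": {"white", "container"},
}
EDGES = [
    ("apple", "on", "table"),
    ("cup", "on", "table"),
    ("cup", "left_of", "apple"),
]


def normalize(weights):
    """Rescale weights to sum to 1 (left unchanged if everything is zero)."""
    total = sum(weights.values())
    if total == 0:
        return dict(weights)
    return {name: w / total for name, w in weights.items()}


def step(state, instruction):
    """Apply one instruction to the attention distribution over nodes."""
    kind, concept = instruction
    new_state = {name: 0.0 for name in NODES}
    if kind == "attr":
        # Keep attention only on nodes that have the attribute.
        for name, p in state.items():
            if concept in NODES[name]:
                new_state[name] = p
    elif kind == "rel":
        # Move attention along edges labelled with the relation.
        for src, rel, dst in EDGES:
            if rel == concept:
                new_state[dst] += state[src]
    else:
        raise ValueError(f"unknown instruction kind: {kind}")
    return normalize(new_state)


def run(instructions):
    """Start from uniform attention and apply the instructions in order."""
    state = normalize({name: 1.0 for name in NODES})
    for instruction in instructions:
        state = step(state, instruction)
    return state


# "What is the red thing on?" hand-translated into two instructions.
final = run([("attr", "red"), ("rel", "on")])
answer = max(final, key=final.get)
print(answer)
```

In this toy version the question-to-instruction step is done by hand; in the talk that translation, and the graph itself, are produced by the model.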

Manning: it gets good scores! #conll2019 #emnlp2019

Manning: and the official summary (of the first part). #conll2019 #emnlp2019

Manning: Part II, back to textual QA. Content is from this paper (at EMNLP, by @qi2peng2 and others): nlp.stanford.edu/pubs/qi2019ans… Again: Multi-step reasoning is the focus #conll2019 #emnlp2019

Manning: they use the HotpotQA dataset, as it encourages 2-step reasoning, and also requires provenance for responses. #conll2019 #emnlp2019

Manning: chaining questions are most common, for example (mine) "who gave the second keynote at the only workshop that spanned two days at EMNLP 2019?" #conll2019 #emnlp2019

Manning: standard one-step QA systems don't work so well. New idea: use an IR system to create a "silver standard" set, and run standard QA over the retrieved documents. Codename: GoldEn Retriever #conll2019 #emnlp2019
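A rough sketch of why a second retrieval hop helps on chaining questions. This is not GoldEn Retriever itself: the corpus is invented, retrieval is plain word overlap, and the "query generator" is a hypothetical heuristic (reuse whatever words the first document adds beyond the question), where a real system would use a trained component.

```python
import re

# Invented three-document corpus, for illustration only.
CORPUS = {
    "doc_workshop": "The Example Workshop spanned two days and its second "
                    "keynote was given by Alice Example.",
    "doc_alice": "Alice Example is a professor at Example University.",
    "doc_other": "Bananas are a yellow fruit.",
}


def tokens(text):
    """Lowercased set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, exclude=()):
    """Return the id of the not-yet-seen document with the most word overlap."""
    query_tokens = tokens(query)
    candidates = [doc_id for doc_id in CORPUS if doc_id not in exclude]
    return max(candidates, key=lambda d: len(query_tokens & tokens(CORPUS[d])))


def multi_hop(question, num_hops=2):
    """Retrieve one document per hop, rewriting the query between hops."""
    query, found = question, []
    for _ in range(num_hops):
        doc_id = retrieve(query, exclude=found)
        found.append(doc_id)
        # Hypothetical query generator: search next with the words this
        # document introduces that the question did not contain.
        query = " ".join(sorted(tokens(CORPUS[doc_id]) - tokens(question)))
    return found


question = "Where does the person who gave the second keynote at the workshop work?"
hops = multi_hop(question)
print(hops)
```

The question never mentions "Alice", so a single retrieval pass has no way to prefer the second document; the name only becomes searchable after the first hop. A standard reader model would then run over the documents collected across both hops.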

Manning: and... it works pretty well. High in the leaderboard despite having no BERT. #conll2019 #emnlp2019

Manning: big picture summary. References Kahneman's "Thinking, Fast and Slow". Most work has been "fast" (pattern matching), let's do "slow" (thinking and reasoning) #conll2019 #emnlp2019