Discover and read the best Twitter threads about #NAACL2022


📢 A 🧵 on the Trends in NLP Datasets.

What’s changed since SQuAD was all the rage in 2016? A: A LOT. 🔭

1. Generic ➡️ Niche Tasks
2. Task-specific Training+Eval ➡️ Eval Only
3. Dataset ➡️ Benchmark ➡️ Massive Collections
4. Datasets ➡️ Diagnostics

1/
What started as a trickle has become an explosion of NLP datasets over the last few years.

Sebastian Ruder (@seb_ruder) used to track all NLP datasets on his website, nlpprogress.com. It’s no longer possible to keep it up-to-date.

2/
🌟 Trend 1 🌟 Generic datasets are being replaced with more niche datasets.

⏳ Before: datasets were released for general tasks.

⌛️ Now: We see tasks targeting hyper-specific abilities.

Examples:

3/
Thinking back to the great #NAACL2022 keynote by Batya Friedman (of UW's @TechPolicyLab and Value Sensitive Design Lab). She ended with some really valuable ideas for going forward, in these slides:

Here, I really appreciated point 3, "Think outside the AI/ML box".

>> [Screenshot of slide]
As societies and as scientific communities, we are surely better served by exploring multiple paths rather than piling all resources (funding, researcher time & ingenuity) on MOAR DATA, MOAR COMPUTE! Friedman points out that this is *environmentally* urgent as well.

>>
Where above she draws on the lessons of nuclear power (what other robust sources of non-fossil energy would we have now if we'd spread our search more broadly back then?), here she draws on the lessons of plastics: they are key for some use cases (esp. medical). >> [Screenshot of slide]
1/n My main takeaways from the industry panel at #NAACL2022
Skills/Tips for Students:
* "find your strength [something you can do better than others]...and double down on that"
* Be resourceful and be curious
* Have a learning mindset
2/n Book recommendations for success:
* Sometimes you just have to go through [an experience]
Work life balance:
* Burnout is not good, make time for what makes you happy.
* Set boundaries, e.g., email only during weekdays.
* Avoid comparing yourself to others.
3/n Creating a Network:
* Attend conferences, talk to people, and ask questions.
* Be memorable -- send an email, e.g., commenting on their work or sharing related work.
* Attend conferences with your labmates/advisor; they can introduce you to people they know.
Happy to share our #NAACL2022 paper: “Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?”

🐭MPoSM models long-term & bidirectional tag dependency.

arxiv.org/abs/2206.14969

w/ @byryuer @mohitban47

Join us in Seattle (oral session 6B, July 12)
🧵
Previous Part-Of-Speech (POS) induction models usually make independence assumptions (e.g., Markov, unidirectional, local dependency) that do not hold in real languages. For example, subject-verb agreement can be both long-term and bidirectional.
Our Masked Part-Of-Speech Model (🐭MPoSM, pronounced "m-possum") is inspired by masked language modeling. It has two parts: a Local POS Prediction module and a Masked POS Reconstruction module. Through the reconstruction objective, it models arbitrary tag dependencies.
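
To make the two-module design concrete, here is a minimal PyTorch sketch of the idea; the sizes, the transformer reconstructor, and the loss wiring are illustrative assumptions, not the paper's exact model (see the arXiv link above for that).

```python
# Minimal sketch of the two-module idea (illustrative, NOT the paper's exact
# architecture): a local predictor produces soft tag distributions, and a
# masked reconstructor predicts tags at masked positions from the full tag
# sequence, allowing long-range, bidirectional tag dependencies.
import torch
import torch.nn as nn

VOCAB, NUM_TAGS, HIDDEN = 1000, 45, 64

class LocalPOSPredictor(nn.Module):
    """Predicts a soft POS distribution for each word."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HIDDEN)
        self.ff = nn.Linear(HIDDEN, NUM_TAGS)

    def forward(self, word_ids):                        # (batch, seq)
        return self.ff(self.emb(word_ids)).softmax(-1)  # (batch, seq, tags)

class MaskedPOSReconstructor(nn.Module):
    """Reconstructs masked tags from the whole (soft) tag sequence."""
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=NUM_TAGS, nhead=5, batch_first=True)  # 45 = 5 heads * 9
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(NUM_TAGS, NUM_TAGS)

    def forward(self, soft_tags, mask):                 # mask: (batch, seq) bool
        hidden = self.encoder(soft_tags.masked_fill(mask.unsqueeze(-1), 0.0))
        return self.out(hidden)                         # logits over tags

words = torch.randint(0, VOCAB, (2, 10))
mask = torch.zeros(2, 10, dtype=torch.bool)
mask[:, 3] = True                                       # mask position 3

soft_tags = LocalPOSPredictor()(words)
logits = MaskedPOSReconstructor()(soft_tags, mask)
# Reconstruction objective: agree with the local predictor at masked positions.
loss = nn.functional.cross_entropy(logits[mask], soft_tags[mask].argmax(-1))
print(loss.item())
```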
Want a captioning system that describes images in finer detail while staying grammatical, but existing caption annotations are not fine-grained?

Check our #NAACL2022 Findings paper “Fine-grained Image Captioning with CLIP Reward”!

arxiv.org/abs/2205.13115

@AdobeResearch @uncnlp

🧵👇
(1/n)
Toward more descriptive and distinctive caption generation, we propose calculating multimodal similarity with CLIP and using it as a reward function. This avoids imitating only the reference caption and instead transfers fine-grained details from similar training images.

(2/n)
We found that using CLIP-S (@jmhessel et al.) as the reward provides such fine-grained guidance, but the model trained with it degenerates into repeated words. Since CLIP is trained only with a contrastive objective, its text encoder doesn't care about grammar.

(3/n)
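
For readers who want to see what a CLIP-S reward looks like in code, here is a hedged sketch using the Hugging Face CLIP implementation; the 2.5 * max(cosine, 0) scaling follows Hessel et al.'s CLIPScore definition, and the RL loop that would consume the reward is omitted.

```python
# Hedged sketch: compute a CLIP-S-style reward for candidate captions with
# the Hugging Face CLIP implementation. The 2.5 * max(cosine, 0) scaling
# follows Hessel et al.'s CLIPScore; the RL training loop that consumes
# this reward (e.g., self-critical sequence training) is omitted.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_s_reward(image: Image.Image, captions: list[str]) -> torch.Tensor:
    """Return one reward per candidate caption."""
    inputs = processor(text=captions, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    cosine = (txt @ img.T).squeeze(-1)        # (num_captions,)
    return 2.5 * cosine.clamp(min=0.0)

# rewards = clip_s_reward(Image.open("photo.jpg"), ["a dog on grass", "a dog"])
```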
- How good is GPT-3 at generating human-acceptable free-text explanations?
- Can we produce good explanations with few-shot prompting?
- Can human preference modeling produce even better explanations from GPT-3?

We answer these questions and more in our #NAACL2022 paper. 🧵1/12
“Reframing Human-AI Collaboration for Generating Free-Text Explanations” with @jmhessel @Swabha @mark_riedl @YejinChoinka

Paper: arxiv.org/abs/2112.08674
Code/data: github.com/allenai/few_sh…
We investigate the role high-quality prompts play in the quality of generated free-text explanations. Using explanations we write in the prompt greatly increases crowdsourced preferences for GPT-3 explanations over (human-written) explanations in existing datasets.
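
As a rough illustration of the few-shot setup (the exemplar format below is invented for this sketch, not the paper's actual template), the prompt simply interleaves a handful of hand-written (input, explanation) exemplars before the query instance:

```python
# Invented exemplar format for illustration (NOT the paper's template):
# a few (premise, hypothesis, explanation) examples, then the query.
EXEMPLARS = [
    ("A man plays a guitar on stage.", "The man is performing.",
     "Playing a guitar on stage is a way of performing for an audience."),
    ("A child splashes in a puddle.", "The child is outdoors.",
     "Puddles form outside, so splashing in one implies being outdoors."),
]

def build_prompt(premise: str, hypothesis: str) -> str:
    blocks = [f"Premise: {p}\nHypothesis: {h}\nExplanation: {e}\n"
              for p, h, e in EXEMPLARS]
    blocks.append(f"Premise: {premise}\nHypothesis: {hypothesis}\nExplanation:")
    return "\n".join(blocks)

print(build_prompt("A dog runs through a field.", "An animal is outside."))
```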
Tired of beam search and all the heuristics needed to make it work well in MT?
In our work accepted at #NAACL2022 (co-lead @tozefarinhas) we explore an alternative decoding method that leverages neural metrics to produce better translations!

arxiv.org/abs/2205.00978

1/14
The most common way to obtain translations from a trained MT model is to approximately compute the *maximum-a-posteriori* (MAP) translation with algorithms like beam search.

However, many works have questioned the utility of likelihood as a proxy for translation quality.

2/14
In parallel, significant progress has been made recently in improving methods for Quality Estimation (QE) and evaluation of translated sentences by using pretrained LMs, with metrics such as BLEURT or COMET(-QE) achieving high correlations with human judgments of quality.

3/14
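
One way to leverage such metrics at decoding time is minimum-Bayes-risk (MBR) reranking over sampled candidates. Here's a generic sketch; the chrF utility from sacrebleu is only a stand-in so the example runs, whereas in practice a neural metric like COMET(-QE) would score the candidates.

```python
# Generic MBR reranking sketch: each candidate is scored by its average
# utility against the other candidates (treated as pseudo-references), and
# the consensus translation wins. chrF (sacrebleu) is a stand-in utility;
# in practice a neural metric such as COMET would be used.
from sacrebleu.metrics import CHRF

chrf = CHRF()

def utility(hyp: str, pseudo_ref: str) -> float:
    return chrf.sentence_score(hyp, [pseudo_ref]).score

def mbr_decode(candidates: list[str]) -> str:
    def expected_utility(hyp: str) -> float:
        others = [c for c in candidates if c is not hyp]
        return sum(utility(hyp, r) for r in others) / max(len(others), 1)
    return max(candidates, key=expected_utility)

samples = ["the cat sat on the mat",
           "a cat sits on the mat",
           "the cat is sitting on the mat"]
print(mbr_decode(samples))
```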
🥳 New paper accepted at #NAACL2022 (Main) 🥳

NLP tasks like hate speech detection are subjective: annotators disagree about what the correct data labels are. We propose two contrasting paradigms to enable better data annotation.

arxiv.org/pdf/2112.07475…

⬇️ Highlights below ⬇️
⚠️ We argue that dataset creators should consider annotator subjectivity in the annotation process and either explicitly encourage it or discourage it, depending on the intended use of their dataset ⚠️
As a framework, we propose two contrasting data annotation paradigms:
1️⃣ The descriptive paradigm encourages annotator subjectivity to create datasets as granular surveys of individual beliefs
2️⃣ The prescriptive paradigm discourages subjectivity and instead tasks annotators with encoding one specific belief, formulated in the annotation guidelines
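
A toy illustration of how the two paradigms treat the same raw annotations (the labels and aggregation below are invented for this sketch, not taken from the paper):

```python
# Toy illustration (invented labels/aggregation): the descriptive paradigm
# keeps the full label distribution as signal, while the prescriptive
# paradigm collapses annotations to one guideline-backed label.
from collections import Counter

annotations = ["hateful", "not_hateful", "hateful", "hateful", "not_hateful"]
counts = Counter(annotations)

# Descriptive: disagreement itself is data -- a granular survey of beliefs.
descriptive = {label: n / len(annotations) for label, n in counts.items()}

# Prescriptive: disagreement is noise; aggregate to a single belief
# (majority vote here; real guidelines would adjudicate differently).
prescriptive = counts.most_common(1)[0][0]

print(descriptive)   # {'hateful': 0.6, 'not_hateful': 0.4}
print(prescriptive)  # hateful
```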
How far can we get by training Text-to-SQL models without any annotated SQL? Pretty far, it seems!

Our new work in Findings of #NAACL2022 with @JonathanBerant and Daniel Deutch

Paper: arxiv.org/abs/2112.06311
Code: github.com/tomerwolgithub…

🧵 1/5
Text-to-SQL models should help non-experts easily query databases. But annotating examples to train them requires expertise (labeling NL utterances with SQL queries).

Can we train good enough models without any expert annotations?

2/5
Instead of gold SQL, we train text-to-SQL models on weak supervision: (1) answers & (2) question decompositions (annotated / predicted by a model) ⛏️

3/5
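
A natural building block for answer-based supervision is execution filtering: keep a candidate SQL query only if it executes to the annotated answer. The sketch below (hypothetical helper names; candidate generation from decompositions is abstracted away) shows that check with Python's built-in sqlite3:

```python
# Hedged sketch of execution-based filtering for answer-only supervision:
# a candidate SQL query is kept only if running it on the database yields
# the annotated answer. Candidate generation is abstracted away; helper
# names are hypothetical.
import sqlite3

def execution_matches(db_path: str, sql: str, gold_answer) -> bool:
    """True iff `sql` runs and its result equals the annotated answer."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(sql).fetchall()    # list of result tuples
    except sqlite3.Error:
        return False                          # malformed candidates are dropped
    finally:
        con.close()
    return rows == gold_answer

def filter_candidates(db_path: str, candidates: list[str], gold_answer):
    return [q for q in candidates if execution_matches(db_path, q, gold_answer)]
```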
1/7 I am thrilled to announce Aspire, a new method for scientific document similarity from my internship with @allen_ai to be presented at #NAACL2022!

📰Paper: arxiv.org/abs/2111.08366
👾🐱Code, data, HF models: github.com/allenai/aspire

A TLDR of our contribs:
2/7 Scientific papers generally consist of many aspects, so we built a method that represents these aspects and uses them to compute similarity between scientific papers. [Figure: the multiple aspects of a scientific paper]
3/7 We opt for a simple document representation and represent a paper at the sentence level with a contextual encoder. Training a model for similarity between the different aspects of a paper is challenging, since there is rarely any training data for aspect-level similarity.
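
As a loose approximation of the sentence-level idea (NOT the trained Aspire model; see the GitHub repo above for that), one can embed each paper's sentences and score a pair of papers by their best-matching sentence pairs:

```python
# Loose approximation of sentence-level paper similarity (not the trained
# Aspire model): embed each paper's sentences with an off-the-shelf encoder
# and score the pair by averaging each sentence's best cross-paper match.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def paper_similarity(sents_a: list[str], sents_b: list[str]) -> float:
    emb_a = encoder.encode(sents_a, convert_to_tensor=True)
    emb_b = encoder.encode(sents_b, convert_to_tensor=True)
    cos = util.cos_sim(emb_a, emb_b)              # (|A|, |B|) pair scores
    return cos.max(dim=1).values.mean().item()    # avg of best matches
```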
I'm very excited and proud to announce that my team at Meta AI, with our collaborators, will have a strong presence at NAACL 2022 with 8 accepted papers on summarization, question answering, and retrieval technologies. #nlproc #ai #NAACL2022 (see papers in follow-up tweets)
[Summarization and QA] Simple Local Attentions Remain Competitive for Long-Context Tasks: lnkd.in/gEPz2Ytz
[QA and Retrieval] CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training: lnkd.in/gbBV8_Rd
