Ph.D. student @UMassCS, working on natural language processing and deep learning. Former research intern @GoogleAI and @MSFTResearch.
Oct 15, 2021 • 8 tweets • 4 min read
Sharing my internship work @GoogleAI: 1) w/ Soft Prompt Transfer, Prompt Tuning matches or significantly outperforms Model Tuning across model sizes, 2) tasks can help each other via their prompts & task prompts can be used as task embeddings to formalize task similarity.
🧵 1/8
Lester et al. (2021) show that, as model size increases, Prompt Tuning (which learns soft prompts to condition a frozen model to perform tasks) becomes competitive with Model Tuning (a.k.a. fine-tuning). However, large gaps remain between the two at smaller model sizes. 2/8
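The core mechanics of prompt tuning can be sketched with a toy example: learnable prompt embeddings are prepended to the input token embeddings, and only those prompt parameters receive gradient updates while the model stays frozen. This is a minimal illustration, not the paper's implementation — the "frozen model" here is just a fixed mean-pool + linear layer, and all names and sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative only): the "frozen model" is a fixed linear layer
# over mean-pooled embeddings; EMBED_DIM / PROMPT_LEN are arbitrary.
EMBED_DIM, PROMPT_LEN, NUM_CLASSES = 8, 4, 2
frozen_W = rng.normal(size=(EMBED_DIM, NUM_CLASSES))    # never updated
soft_prompt = rng.normal(size=(PROMPT_LEN, EMBED_DIM))  # the only trainable params

def forward(token_embeds, prompt):
    # Prepend the soft prompt to the input embeddings, then run the
    # frozen model (here: mean-pool -> linear -> softmax).
    seq = np.concatenate([prompt, token_embeds], axis=0)
    logits = seq.mean(axis=0) @ frozen_W
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def train_step(token_embeds, label, prompt, lr=0.1):
    # Cross-entropy gradient taken w.r.t. the prompt only; frozen_W untouched.
    probs = forward(token_embeds, prompt)
    grad_logits = probs.copy()
    grad_logits[label] -= 1.0
    seq_len = PROMPT_LEN + token_embeds.shape[0]
    # Every row of the sequence gets the same gradient through the mean-pool.
    grad_prompt = np.tile((frozen_W @ grad_logits) / seq_len, (PROMPT_LEN, 1))
    return prompt - lr * grad_prompt

x = rng.normal(size=(5, EMBED_DIM))   # fake token embeddings for one example
before = forward(x, soft_prompt)[1]
for _ in range(50):
    soft_prompt = train_step(x, 1, soft_prompt)
after = forward(x, soft_prompt)[1]
print(before, after)  # probability of label 1 increases as the prompt is tuned
```

Since only `soft_prompt` (a few hundred parameters here) is updated, the same frozen model can serve many tasks, each with its own learned prompt.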
Sep 15, 2021 • 9 tweets • 4 min read
Excited to announce our #EMNLP2021 paper that shows how to turn a pre-trained language model or even a randomly initialized model into a strong few-shot learner.
1/9👇
Despite their strong performance on many tasks, large-scale pre-trained language models do not perform as well when limited labeled data is available (e.g., on small datasets or in few-shot settings). Collecting more labeled data can help but can also be prohibitively expensive.
Nov 15, 2020 • 12 tweets • 6 min read
Excited to share our @emnlp2020 paper on task transferability:
1) a large-scale empirical study w/ over 3,000 combinations of NLP tasks and data regimes within and across different classes of problems
2) task embedding methods to predict task transferability
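The task-embedding idea above can be sketched as: represent each task with a vector, then predict which source tasks will transfer well to a target task by ranking them by cosine similarity. This is a hypothetical sketch — the embedding values below are invented placeholders, not derived from any real tasks or from the paper's method.

```python
import numpy as np

# Hypothetical task embeddings (illustrative values only).
task_embeddings = {
    "sentiment":  np.array([0.9, 0.1, 0.0]),
    "nli":        np.array([0.2, 0.8, 0.1]),
    "paraphrase": np.array([0.3, 0.7, 0.2]),
}

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_sources(target, embeddings):
    # Predict transferability by ranking candidate source tasks by the
    # cosine similarity of their embeddings to the target task's embedding.
    return sorted(
        (name for name in embeddings if name != target),
        key=lambda name: cosine(embeddings[target], embeddings[name]),
        reverse=True,
    )

ranking = rank_sources("nli", task_embeddings)
print(ranking)  # → ['paraphrase', 'sentiment']
```

With these toy vectors, "paraphrase" ranks above "sentiment" as a source for "nli", matching the intuition that more similar tasks transfer better.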
Transfer learning with large-scale pre-trained language models has become the de facto standard for state-of-the-art performance on many NLP tasks. Can fine-tuning these models on source tasks other than language modeling further improve target task performance? 🤔