Tu Vu
Ph.D. student @UMassCS, working on natural language processing and deep learning. Former research intern @GoogleAI and @MSFTResearch.
Oct 15, 2021
Sharing my internship work @GoogleAI: 1) w/ Soft Prompt Transfer, Prompt Tuning matches or significantly outperforms Model Tuning across model sizes, 2) tasks can help each other via their prompts & task prompts can be used as task embeddings to formalize task similarity.

🧵 1/8

Lester et al. (2021) show that, as model size increases, Prompt Tuning (which learns soft prompts to condition a frozen model to perform tasks) becomes competitive with Model Tuning (a.k.a. fine-tuning). However, there are still large gaps between them at small model sizes. 2/8
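For intuition, here is a minimal sketch of Prompt Tuning in PyTorch, not the code from the paper: a small matrix of trainable prompt embeddings is prepended to the input token embeddings, and only the prompt is updated while the pre-trained model stays frozen. The wrapper class, its name, and the assumption that the underlying model accepts an inputs_embeds argument (HuggingFace-style) are all illustrative.

import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    """Prepend a trainable soft prompt to the inputs of a frozen model (illustrative sketch)."""

    def __init__(self, model: nn.Module, embed_dim: int, prompt_len: int = 20):
        super().__init__()
        self.model = model
        for p in self.model.parameters():  # freeze every pre-trained weight
            p.requires_grad = False
        # The soft prompt is the only trainable parameter: prompt_len "virtual token" embeddings.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, input_embeds, **kwargs):
        # input_embeds: (batch, seq_len, embed_dim) embeddings of the task input
        batch_size = input_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)
        # Run the frozen model on [soft prompt; input]; assumes it accepts inputs_embeds.
        return self.model(inputs_embeds=torch.cat([prompt, input_embeds], dim=1), **kwargs)

In this picture, Soft Prompt Transfer amounts to initializing soft_prompt from a prompt already trained on a source task instead of from random noise, and the learned prompts themselves can be compared (e.g., via cosine similarity) to serve as task embeddings.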
Sep 15, 2021
Excited to announce our #EMNLP2021 paper that shows how to turn a pre-trained language model or even a randomly initialized model into a strong few-shot learner.

Paper: arxiv.org/abs/2109.06270
w/ amazing collaborators: @lmthang, @quocleix, @GradySimon, @MohitIyyer

1/9👇

Despite their strong performance on many tasks, large-scale pre-trained language models do not perform as well when limited labeled data is available (e.g., on small datasets or in few-shot settings). Collecting more labeled data can help but can also be prohibitively expensive.
Nov 15, 2020
Excited to share our @emnlp2020 paper on task transferability:

1) a large-scale empirical study w/ over 3,000 combinations of NLP tasks and data regimes within and across different classes of problems

2) task embedding methods to predict task transferability

1/12👇

Transfer learning with large-scale pre-trained language models has become the de facto standard for state-of-the-art performance on many NLP tasks. Can fine-tuning these models on source tasks other than language modeling further improve target task performance? 🤔
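As a rough, hypothetical sketch of the second contribution (not the paper's released code): once every task has a task embedding vector, predicting transferability can be as simple as ranking candidate source tasks by cosine similarity to the target task's embedding. The embeddings and task names below are random placeholders; how the task embeddings are actually computed is what the paper proposes.

import numpy as np

def rank_source_tasks(target_emb, source_embs):
    """Sort candidate source tasks by cosine similarity to the target task embedding."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: cosine(target_emb, emb) for name, emb in source_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage with random placeholder embeddings (illustrative only).
rng = np.random.default_rng(0)
target = rng.normal(size=128)
sources = {"source_task_A": rng.normal(size=128), "source_task_B": rng.normal(size=128)}
print(rank_source_tasks(target, sources))  # most similar (most promising) source task first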