The pretrain-then-finetune paradigm is a staple of transfer learning, but is it always the right way to use auxiliary tasks? In our #ICLR2022 paper openreview.net/forum?id=2bO2x…, we show that in settings where the end-task is known in advance, we can do better.
[1/n]
@gneubig @atalwalkar @pmichelX TL;DR: instead of a decoupled pretrain-then-finetune pipeline, we multitask the end-task with the auxiliary objectives and use meta-learning to determine the end-task and auxiliary task weights. Our approach improves performance and data efficiency in low-resource settings.
[2/n]
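(Not from the thread or paper: below is a minimal PyTorch sketch of the general recipe, under simplifying assumptions. The toy linear tasks, the first-order "gradient alignment" meta update for the task weights, and every name and hyperparameter here are illustrative stand-ins, not the paper's exact meta-learning procedure.)

import torch

torch.manual_seed(0)

# Toy data (illustrative only): shared inputs, one end task, one related auxiliary task.
w_true = torch.randn(8, 1)
X_train = torch.randn(64, 8)
y_end = X_train @ w_true + 0.1 * torch.randn(64, 1)          # end-task labels
y_aux = X_train @ (w_true + 0.3) + 0.1 * torch.randn(64, 1)  # related auxiliary labels
X_val = torch.randn(32, 8)
y_val = X_val @ w_true + 0.1 * torch.randn(32, 1)            # held-out end-task data

model = torch.nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
mse = torch.nn.functional.mse_loss

# Learnable (log-)weights over {end task, auxiliary task}, adjusted by a meta step.
log_w = torch.zeros(2)

for step in range(200):
    # Inner step: one multitask update of the model under the current task weights.
    task_w = torch.softmax(log_w, dim=0)
    loss = task_w[0] * mse(model(X_train), y_end) + task_w[1] * mse(model(X_train), y_aux)
    opt.zero_grad()
    loss.backward()
    opt.step()

    # Meta step (first-order proxy): upweight tasks whose training gradients align
    # with the gradient of the held-out end-task loss.
    params = list(model.parameters())
    g_val = torch.autograd.grad(mse(model(X_val), y_val), params)
    g_end = torch.autograd.grad(mse(model(X_train), y_end), params)
    g_aux = torch.autograd.grad(mse(model(X_train), y_aux), params)
    align = torch.stack([
        sum((a * b).sum() for a, b in zip(g_end, g_val)),
        sum((a * b).sum() for a, b in zip(g_aux, g_val)),
    ])
    log_w = log_w + 0.01 * align  # illustrative meta learning rate

print("final task weights (end, aux):", torch.softmax(log_w, dim=0).tolist())
print("held-out end-task MSE:", mse(model(X_val), y_val).item())

The point this toy run illustrates: the auxiliary objective only keeps its weight to the extent its gradients agree with the held-out end-task gradient, so the mixing weights adapt during training rather than being fixed up front.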