In this work we tackle the complex multi-modal problem of referring video segmentation: segmenting an object in a video given its textual description. (2/n)
Mar 26, 2021 • 16 tweets • 5 min read
Our new paper, C2D (arxiv.org/abs/2103.13646, github.com/ContrastToDivi…), shows how self-supervised pre-training boosts learning with noisy labels, achieves SOTA performance, and provides in-depth analysis. Authors: @evgeniyzhe, @ChaimBaskin, Avi Mendelson, Alex Bronstein, @orlitany 1/n
The problem of learning with noisy labels (LNL) has great practical importance: a large clean dataset is often expensive or impossible to obtain. Existing approaches to LNL either modify the loss to account for the noise or try to detect the noisily labelled samples. 2/n