Happy to finally share our paper on differentiable top-k learning by sorting, which didn’t make it into #CVPR2022 but was accepted at #ICML2022! We show that you can improve classification by actually considering the top-1 prediction plus the runners-up… 1/6🧵
Idea: Top-k accuracy is used to evaluate many ML tasks, but training usually targets a single fixed k, most often top-1. We propose a differentiable top-k classification loss that allows training on any combination of top-k predictions, e.g., top-2 and top-5. 3/6🧵
To this end, we leverage recent advances in differentiable sorting and ranking. This lets us capture the probability that a class is among the top-k given an input, e.g., an image. 4/6🧵
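To make the idea concrete, here is a minimal sketch of a relaxed top-k loss. It is not the paper's implementation: it assumes a SoftSort-style relaxation as a stand-in for the differentiable sorting method, and the names softsort_matrix, topk_probability, topk_loss, tau and k_weights are hypothetical. It uses plain Python to show the forward computation only; for training you would write the same thing in an autodiff framework.

```python
import math

def softsort_matrix(scores, tau=1.0):
    """Relaxed permutation matrix (SoftSort-style, an assumption of this sketch).

    Row i is a distribution over which class holds rank i (descending).
    """
    ranked = sorted(scores, reverse=True)
    matrix = []
    for r in ranked:
        logits = [-abs(r - s) / tau for s in scores]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        matrix.append([e / z for e in exps])
    return matrix

def topk_probability(scores, label, k, tau=1.0):
    """Relaxed probability that class `label` is ranked among the top k."""
    P = softsort_matrix(scores, tau)
    return sum(P[i][label] for i in range(k))

def topk_loss(scores, label, k_weights, tau=1.0):
    """Negative log of a weighted mix of top-k probabilities.

    k_weights maps k -> weight (weights should sum to 1),
    e.g. {1: 0.5, 5: 0.5} mixes top-1 and top-5.
    """
    p = sum(w * topk_probability(scores, label, k, tau)
            for k, w in k_weights.items())
    # This relaxation is only row-stochastic, so p can drift slightly
    # above 1; clamp it before taking the log.
    return -math.log(min(max(p, 1e-12), 1.0))

if __name__ == "__main__":
    scores = [3.0, 1.0, 0.2, -1.0]  # class 1 is the runner-up
    print(topk_loss(scores, label=1, k_weights={1: 1.0}, tau=0.1))
    print(topk_loss(scores, label=1, k_weights={1: 0.5, 2: 0.5}, tau=0.1))
```

With a small tau the relaxation approaches hard sorting, so a runner-up label is penalized heavily under a pure top-1 weighting and much less once top-2 gets some weight.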
We evaluate the top-k loss on state-of-the-art architectures. We find that relaxing k not only produces better top-5 accuracy but also improves top-1 accuracy, and that fine-tuning publicly available ImageNet models with it achieves a new state of the art. 6/6🧵
Hope you enjoy the paper! Feel free to leave comments or contact us if you have questions. Code will be available soon!
• • •
Check out our #CVPR2022 paper! We improve multimodal zero-shot text-to-video retrieval on YouCook2/MSR-VTT by leveraging a fusion transformer and a combinatorial loss. 1/🧵
We propose a multimodal, modality-agnostic fusion transformer that learns to exchange information between multiple modalities, e.g., video, audio, and text, and builds an embedding that aggregates the multimodal information.