I'm super excited to share our work on self-supervised learning for audio! We extend the permutation pretext task with differentiable ranking and show improved performance on low-resource tasks (it also works great on images and video).
1/
When pretraining with permutations, a small fixed subset of permutations is used to train a classifier that predicts which permutation was applied, treating each one as a class.
However, since there are n! different permutations of length n, it's not feasible to cover more than a tiny fraction of them as classes (toy sketch below).
2/
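For context, here's a toy sketch of that classification-style setup. The names (`perm_classes`, `make_example`) and the choice of K = 100 are illustrative, not from the paper:

```python
import math
import random

n = 10                        # segments a clip is chopped into
print(math.factorial(n))      # 3628800 possible orderings

# Classification-style pretext task: pick a small, fixed subset of
# permutations and train a classifier to predict which one was applied.
K = 100                                               # classes actually used
perm_classes = [tuple(random.sample(range(n), n)) for _ in range(K)]

def make_example(segments):
    """Shuffle with one of the K fixed permutations; the label is its index."""
    label = random.randrange(K)
    shuffled = [segments[i] for i in perm_classes[label]]
    return shuffled, label
```

The classifier only ever sees those K orderings, which is a vanishing slice of the 3.6M possibilities even for n = 10.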
We fix this by swapping classification for a differentiable ranking objective, which lets training use arbitrary permutations (sketch after this tweet).
With far more usable permutations, the model learns better representations that transfer to downstream tasks.
3/
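A minimal sketch of what a differentiable ranking loss can look like, using a simple pairwise-sigmoid soft rank in JAX. The paper's actual ranking operator is more principled, so treat this purely as an illustration of how any of the n! permutations can serve as a regression target:

```python
import jax
import jax.numpy as jnp

def soft_rank(scores, tau=0.1):
    """Differentiable surrogate for each score's rank (0 = smallest).

    rank_i ~= sum_j sigmoid((s_i - s_j) / tau), so gradients flow through
    the ordering instead of through a hard argsort.
    """
    diff = scores[:, None] - scores[None, :]              # (n, n) pairwise gaps
    return jax.nn.sigmoid(diff / tau).sum(axis=1) - 0.5   # drop self-comparison

def ranking_loss(scores, original_positions):
    """MSE between the soft ranks of the model's per-segment scores and each
    shuffled segment's true original position: any permutation is a valid target."""
    return jnp.mean((soft_rank(scores) - original_positions) ** 2)

# Gradients w.r.t. the scores (and hence the encoder that produced them):
grads = jax.grad(ranking_loss)(jnp.array([0.3, -1.2, 0.8, 0.1]),
                               jnp.array([1.0, 0.0, 3.0, 2.0]))
```

Because the target is just a vector of positions rather than a class index, nothing limits training to a fixed subset of permutations.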
The paper has nice graphs, cool math, and describes a simple algorithm that is quite competitive.
4/
The work was done while I was at @GoogleAI with an incredible team. @qberthet, @mblondel_ml, Olivier Teboul, and @neilzegh all did amazing work and I learned a ton from them.