ACM #Multimedia 2021: Skeleton-Contrastive 3D Action Representation Learning w/ @fmthoker, @doughty_hazel: arxiv.org/abs/2108.03656 We learn invariances to multiple #skeleton representations and introduce various skeleton augmentations via noise contrastive estimation 1/n
Contribution I: leverage multiple input-representations of #3D-#skeleton sequences. Our inter-skeleton contrast learns from a pair of representations in a cross-contrastive fashion. This enriches the sparse input space and focuses the model on the high-level semantics of the skeleton data. 2/n
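A minimal sketch of what a cross-contrastive objective between two skeleton representations could look like, assuming a simplified in-batch InfoNCE loss. The function names (`info_nce`, `inter_skeleton_loss`), the temperature value, and the in-batch negatives are illustrative assumptions, not the released implementation; see the paper for the actual training setup.

```python
import numpy as np

def info_nce(queries, keys, temperature=0.07):
    """InfoNCE loss: queries[i] and keys[i] are the positive pair,
    every other key in the batch acts as a negative (assumption)."""
    # L2-normalise so the dot product is a cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature                 # shape (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The diagonal holds the log-probability of each true pair.
    return float(-np.mean(np.diag(log_prob)))

def inter_skeleton_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric contrast between embeddings of the same clips computed
    from two different input representations (hypothetical encoders)."""
    return 0.5 * (info_nce(emb_a, emb_b, temperature)
                  + info_nce(emb_b, emb_a, temperature))

# Toy usage: random vectors stand in for the two encoders' outputs.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(8, 16))
emb_b = emb_a + 0.01 * rng.normal(size=(8, 16))   # same clips, other representation
aligned = inter_skeleton_loss(emb_a, emb_b)
shuffled = inter_skeleton_loss(emb_a, np.roll(emb_b, 1, axis=0))
```

In this toy setup the loss is low when row i of both embedding matrices comes from the same clip and high when the pairing is shuffled, which is the signal that pulls the two representations of one sequence together.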
Contribution II: three skeleton-specific #augmentations for generating positive pairs, which encourage the model to focus on the spatio-temporal #dynamics of skeleton-based #action sequences, ignoring confounding factors such as viewpoint and exact joint positions. 3/n
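A rough sketch of how skeleton-specific augmentations might produce a positive pair. The thread does not name the three augmentations, so the shear, joint jitter, and temporal crop-resize below, along with every function name and parameter value, are illustrative assumptions. A sequence is taken to be an array of shape (frames, joints, 3).

```python
import numpy as np

def shear(seq, rng, strength=0.5):
    """Random shear of the 3D coordinates, loosely mimicking a viewpoint change."""
    s = rng.uniform(-strength, strength, size=6)
    m = np.array([[1.0, s[0], s[1]],
                  [s[2], 1.0, s[3]],
                  [s[4], s[5], 1.0]])
    return seq @ m.T

def joint_jitter(seq, rng, num_joints=5, scale=0.05):
    """Add Gaussian noise to a random subset of joints in every frame."""
    out = seq.copy()
    idx = rng.choice(seq.shape[1], size=num_joints, replace=False)
    out[:, idx, :] += rng.normal(0.0, scale, size=(seq.shape[0], num_joints, 3))
    return out

def temporal_crop_resize(seq, rng, min_ratio=0.5):
    """Crop a random contiguous segment and resample it to the original length."""
    t = seq.shape[0]
    length = int(rng.integers(max(2, int(min_ratio * t)), t + 1))
    start = int(rng.integers(0, t - length + 1))
    # Nearest-frame resampling back to t frames.
    idx = np.linspace(start, start + length - 1, t).round().astype(int)
    return seq[idx]

def positive_pair(seq, rng):
    """Two independently augmented views of the same sequence."""
    def augment(x):
        return joint_jitter(shear(temporal_crop_resize(x, rng), rng), rng)
    return augment(seq), augment(seq)

# Toy usage: 20 frames, 25 joints, random coordinates.
rng = np.random.default_rng(0)
seq = rng.normal(size=(20, 25, 3))
view_1, view_2 = positive_pair(seq, rng)
```

Both views keep the input shape, so they can be fed to the same encoder and treated as a positive pair in the contrastive loss.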
Our approach achieves state-of-the-art performance for self-supervised learning from #skeleton data on the challenging #PKU and #NTU datasets across multiple downstream tasks, including #action #recognition, #action #retrieval and #semi-#supervised #learning. 4/n
Code is here: github.com/fmthoker/skele… n/n

