Ravi Kiran S
Associate Prof, IIIT-H, India | Alum: UW, IISc | Enjoy working on interdisciplinary problems involving multimedia | Meme fan | 🇮🇳

Feb 3, 2021, 7 tweets

📢 Introducing SynSE, a language-guided approach for generalized zero-shot learning of pose-based action representations! Great effort by @bublaasaur and @divyanshu1709 #actionrecognition

Paper: arxiv.org/abs/2101.11530…
Code: github.com/skelemoa/synse…

🧵👇

For enabling compositional generalization to novel action-object combinations, the action description is transformed into individual Part-of-Speech based embeddings.
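A rough sketch of the idea, not the code from the repo: the description is split by part of speech and each group gets its own embedding. The tiny PoS lexicon, the 3-d word vectors, the averaging step, and the choice to keep only verbs and nouns are all assumptions made for illustration. A real pipeline would use a PoS tagger and pretrained word embeddings.

```python
# Toy sketch: split an action description into per-PoS word groups and
# average hypothetical word vectors within each group.

# Hypothetical hand-written PoS lexicon (assumption; stands in for a tagger).
TOY_POS = {
    "drink": "VERB", "put": "VERB",
    "water": "NOUN", "hat": "NOUN",
    "on": "ADP", "a": "DET",
}

# Hypothetical 3-d word vectors (assumption; stands in for pretrained embeddings).
TOY_VECS = {
    "drink": [1.0, 0.0, 0.0],
    "put": [0.0, 1.0, 0.0],
    "water": [0.0, 0.0, 1.0],
    "hat": [1.0, 1.0, 0.0],
}

DIM = 3


def pos_embeddings(description, keep=("VERB", "NOUN")):
    """Return one averaged embedding per kept PoS tag."""
    groups = {tag: [] for tag in keep}
    for word in description.lower().split():
        tag = TOY_POS.get(word)
        if tag in groups and word in TOY_VECS:
            groups[tag].append(TOY_VECS[word])

    embeddings = {}
    for tag, vecs in groups.items():
        if vecs:
            # Mean of the word vectors that share this PoS tag.
            embeddings[tag] = [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]
        else:
            # No word with this tag: fall back to a zero vector.
            embeddings[tag] = [0.0] * DIM
    return embeddings


if __name__ == "__main__":
    print(pos_embeddings("drink water"))
    print(pos_embeddings("put on a hat"))
```

Because verbs and nouns are embedded separately, an unseen combination such as "drink" + "hat" can still be described from parts that were each seen during training.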

The PoS-based embeddings are aligned with the action sequence embedding via a VAE-based generative space. This alignment is optimized using within-modality and cross-modality constraints.
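One common way to realize within-modality and cross-modality constraints is sketched below. This is an assumption for illustration, not the exact SynSE loss: "within" reconstructs each modality from its own latent, "cross" reconstructs each modality from the other modality's latent, and a KL term pulls each posterior toward a standard normal. The function names, the squared-error reconstruction, and the `beta` weight are hypothetical. The posterior mean is used as the latent to keep the toy deterministic, whereas a real VAE would sample.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ) for a diagonal Gaussian."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment_loss(x_skel, x_text, enc_skel, enc_text, dec_skel, dec_text, beta=1.0):
    """Toy two-modality VAE objective with within and cross reconstruction.

    enc_* map an input to (mu, logvar); dec_* map a latent back to an input.
    """
    mu_s, logvar_s = enc_skel(x_skel)
    mu_t, logvar_t = enc_text(x_text)

    # Within-modality: decode each latent with its own decoder.
    within = squared_error(dec_skel(mu_s), x_skel) + squared_error(dec_text(mu_t), x_text)

    # Cross-modality: decode each latent with the other modality's decoder.
    cross = squared_error(dec_skel(mu_t), x_skel) + squared_error(dec_text(mu_s), x_text)

    kl = kl_to_standard_normal(mu_s, logvar_s) + kl_to_standard_normal(mu_t, logvar_t)

    return {"within": within, "cross": cross, "kl": kl,
            "total": within + cross + beta * kl}


if __name__ == "__main__":
    # Identity encoders/decoders, purely to exercise the function.
    encode = lambda x: (list(x), [0.0] * len(x))
    decode = lambda z: list(z)
    print(alignment_loss([1.0, 0.0], [0.0, 1.0], encode, encode, decode, decode))
```

In this toy, the cross term only drops when the two modalities land close together in latent space, which is the point of the alignment.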

The default ZSL paradigm is biased towards seen classes. We use the elegant gating approach by Atzmon & co. for Generalized ZSL. Essentially, we learn a binary classifier which distinguishes between seen-class and unseen-class samples.
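A minimal sketch of how such a gate could combine two classifiers at test time. This is a simplified soft-gating rule written as an assumption, not the exact formulation used in the paper: the gate's seen-probability weights the seen-class classifier, and its complement weights the unseen-class (ZSL) classifier. The function name and class labels are hypothetical.

```python
def gated_prediction(p_seen, seen_probs, unseen_probs):
    """Combine seen-class and unseen-class predictions with a binary gate.

    p_seen       -- gate output: probability that the sample is from a seen class
    seen_probs   -- dict label -> probability from the seen-class classifier
    unseen_probs -- dict label -> probability from the unseen-class (ZSL) classifier
    """
    scores = {}
    for label, p in seen_probs.items():
        scores[label] = p_seen * p
    for label, p in unseen_probs.items():
        scores[label] = (1.0 - p_seen) * p
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    seen = {"walk": 0.9, "sit": 0.1}          # hypothetical seen classes
    unseen = {"juggle": 0.7, "salute": 0.3}   # hypothetical unseen classes
    print(gated_prediction(0.2, seen, unseen)[0])
    print(gated_prediction(0.9, seen, unseen)[0])
```

Without the gate, a confident seen-class score would usually win outright. With it, a sample the gate flags as likely unseen is routed to the ZSL classifier instead.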

SynSE obtains state-of-the-art ZSL and GZSL performance on the large-scale NTU-RGBD skeleton action dataset.

📣 Shout-out to JPoSE (mwray.github.io/FGAR/) and CADA-VAE (github.com/edgarschnfld/C…), which inspired our work. JPoSE aligns per-PoS language embeddings with their visual counterparts, but in a non-generative setting. CADA-VAE performs visuo-lingual alignment in a VAE-based setting, but without PoS-awareness.

For details, read our paper arxiv.org/abs/2101.11530… and browse our code and pre-trained models at github.com/skelemoa/synse… 🌟
