PhD student @mldcmu @SCSatCMU. Foundations of multimodal learning & applications in social AI, NLP, and healthcare with @lpmorency and @rsalakhu.
Apr 15, 2022 • 4 tweets • 7 min read
Are you working on federated learning over heterogeneous data? Use Vision Transformers as a backbone!
In our upcoming #CVPR2022 paper, we perform extensive experiments demonstrating the effectiveness of ViTs for FL:
Using ViTs, we are able to scale FL up to the edge case of heterogeneity: 6,000 and 45,000 clients with only 1 sample per client!
Apr 14, 2022 • 7 tweets • 9 min read
[11877 Advanced Topics in Multimodal ML] In week 11, the class formalized a taxonomy of dataset and model biases (social bias, annotator bias, shortcuts, spurious correlations) and proposed solutions to mitigate them in multimodal settings.
[11877 Advanced Topics in Multimodal ML] In week 5’s session, the class aimed to define a taxonomy of multimodal reasoning: the (hierarchical) composition of unimodal and multimodal evidence into higher-level abstract concepts for prediction.
Notes here: cmu-multicomp-lab.github.io/adv-mmml-cours… @mldcmu @LTIatCMU @lpmorency
Some suggested papers:
CLEVRER: CoLlision Events for Video REpresentation and Reasoning arxiv.org/abs/1910.01442
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" arxiv.org/abs/2006.11524