Paul Liang
PhD student @mldcmu @SCSatCMU. Foundations of multimodal learning & applications in social AI, NLP, and healthcare with @lpmorency and @rsalakhu.
Apr 15, 2022
Are you working on federated learning over heterogeneous data? Use Vision Transformers as a backbone!
In our upcoming #CVPR2022 paper, we perform extensive experiments demonstrating the effectiveness of ViTs for FL:

paper: arxiv.org/abs/2106.06047
code: github.com/Liangqiong/ViT… @vickyqu0 @yuyinzhou_cs @mldcmu @StanfordDBDS @StanfordAILab

We find that ViTs are more robust to distribution shift, reduce catastrophic forgetting across devices, accelerate convergence, and converge to better final models.

Using ViTs, we are able to scale FL up to the edge case of heterogeneity: 6,000 and 45,000 clients with only 1 sample per client!
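To make the setup concrete, here is a minimal sketch of federated averaging (FedAvg) with a ViT backbone. This is an illustrative assumption of the training loop, not the paper's exact implementation; see the linked repo for the real code. It assumes torchvision (>= 0.13) for the ViT model.

```python
# Minimal FedAvg sketch with a ViT backbone (illustrative only,
# not the paper's exact setup). Assumes torchvision >= 0.13.
import copy
import torch
from torchvision.models import vit_b_16

def local_update(global_model, loader, epochs=1, lr=1e-3):
    """Train a copy of the global model on one client's local data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

def fed_avg(global_model, client_loaders, rounds=10):
    """Each round: local training per client, then average the weights."""
    for _ in range(rounds):
        states = [local_update(global_model, dl) for dl in client_loaders]
        avg = {k: torch.stack([s[k].float() for s in states]).mean(0)
               for k in states[0]}
        global_model.load_state_dict(avg)
    return global_model

# A ViT instead of a CNN as the shared backbone across clients.
global_model = vit_b_16(num_classes=10)
```

The only change relative to a standard FedAvg pipeline is the backbone swap; the thread's claim is that this swap alone improves robustness under heterogeneous client distributions.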
Apr 14, 2022
[11877 Advanced Topics in Multimodal ML] In week 11, the class formalized a taxonomy of dataset and model biases (social bias, annotator bias, shortcuts, spurious correlations) and proposed solutions to mitigate them in multimodal settings.

Notes here: cmu-multicomp-lab.github.io/adv-mmml-cours… @lpmorency @LTIatCMU @mldcmu

Some suggested papers:
Shortcut learning in deep neural networks nature.com/articles/s4225…
Measuring Social Biases in Grounded Vision and Language Embeddings aclanthology.org/2021.naacl-mai…
Multimodal datasets: misogyny, pornography, and malignant stereotypes arxiv.org/abs/2110.01963
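To make the bias-measurement entry in the list above concrete: the grounded embeddings paper builds on the WEAT effect size (Caliskan et al., 2017), which quantifies how differently two sets of target embeddings associate with two attribute sets. A hedged NumPy sketch of that statistic, assuming embeddings are plain arrays (this is the underlying metric, not the paper's full grounded pipeline):

```python
# WEAT effect size sketch (Caliskan et al., 2017), the statistic that
# grounded bias measures extend. Inputs are lists of embedding vectors.
import numpy as np

def cos(u, v):
    """Cosine similarity between two embedding vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    """Differential association of w with attribute sets A vs. B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Effect size d: how differently targets X vs. Y associate with A vs. B."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```

A large |d| indicates a stereotypical association encoded in the embedding space; the multimodal extension applies the same idea to visually grounded representations.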
Mar 3, 2022
[11877 Advanced Topics in Multimodal ML] In week 5’s session, the class aimed to define a taxonomy of multimodal reasoning: the (hierarchical) composition of unimodal and multimodal evidence into higher-level abstract concepts for prediction.
Notes here: cmu-multicomp-lab.github.io/adv-mmml-cours… @mldcmu @LTIatCMU @lpmorency

Some suggested papers:
CLEVRER: CoLlision Events for Video REpresentation and Reasoning arxiv.org/abs/1910.01442
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" arxiv.org/abs/2006.11524
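As a rough illustration of the hierarchical composition idea from this session: unimodal evidence is encoded per modality, fused into local multimodal evidence, then composed into a higher-level concept used for prediction. All module names and dimensions below are illustrative assumptions, not a model from the course or the listed papers.

```python
# Hedged sketch of hierarchical multimodal composition:
# unimodal evidence -> local multimodal evidence -> abstract concept.
import torch
import torch.nn as nn

class HierarchicalReasoner(nn.Module):
    def __init__(self, d_img=512, d_txt=300, d=256, n_classes=10):
        super().__init__()
        self.img_enc = nn.Linear(d_img, d)          # unimodal evidence (vision)
        self.txt_enc = nn.Linear(d_txt, d)          # unimodal evidence (language)
        self.fuse = nn.Sequential(                  # local multimodal evidence
            nn.Linear(2 * d, d), nn.ReLU())
        self.compose = nn.Sequential(               # higher-level abstract concept,
            nn.Linear(2 * d, d), nn.ReLU())         # re-combining fused + unimodal cues
        self.head = nn.Linear(d, n_classes)         # prediction from the concept

    def forward(self, img, txt):
        hi, ht = self.img_enc(img), self.txt_enc(txt)
        local = self.fuse(torch.cat([hi, ht], dim=-1))
        concept = self.compose(torch.cat([local, hi + ht], dim=-1))
        return self.head(concept)
```

The neuro-symbolic line of work listed above makes this composition explicit by disentangling the perception (encoding) stage from the reasoning (composition) stage.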