Are you working on federated learning over heterogeneous data? Use Vision Transformers as a backbone!
In our upcoming #CVPR2022 paper, we perform extensive experiments demonstrating the effectiveness of ViTs for FL:
Using ViTs, we are able to scale FL to the edge case of heterogeneity: 6000 & 45000 clients with only 1 sample per client!
By virtue of their robustness and generalization properties, ViTs also converge faster with fewer communicated parameters, which makes them appealing for efficient FL.
ViTs can be combined with FL optimization methods (FedProx, FedAvg-Share) to further improve speed & performance.
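For the curious: the core of a FedProx-style local update is just a proximal penalty on top of the task loss, averaged FedAvg-style on the server. A minimal PyTorch sketch, not the paper's released code; `mu`, the helper names, and the plain-SGD loop are illustrative assumptions:

```python
# Sketch of FedProx local training + FedAvg server aggregation.
# Assumes float model parameters and a standard (x, y) classification loader.
import copy
import torch

def local_update_fedprox(model, global_model, loader, mu=0.01, lr=1e-3, epochs=1):
    """One client's local training: task loss + (mu/2) * ||w - w_global||^2.
    The proximal term keeps local weights near the global model, which
    stabilizes training under heterogeneous client data."""
    global_params = [p.detach().clone() for p in global_model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            # FedProx proximal penalty against the frozen global weights.
            prox = sum((p - g).pow(2).sum()
                       for p, g in zip(model.parameters(), global_params))
            (loss + 0.5 * mu * prox).backward()
            opt.step()
    return model.state_dict()

def fedavg_aggregate(client_states, client_sizes):
    """Server side: FedAvg weighted average of client weights by sample count."""
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for k in avg:
        avg[k] = sum(s[k] * (n / total)
                     for s, n in zip(client_states, client_sizes))
    return avg
```

With mu=0 this reduces to plain FedAvg local training; the backbone passed in as `model` can be any classifier, ViT included.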
[11877 Advanced Topics in Multimodal ML] In week 11, the class formalized a taxonomy of dataset and model biases (social bias, annotator bias, shortcuts, spurious correlations) and proposed solutions to mitigate them in multimodal settings.
[11877 Advanced Topics in Multimodal ML] In week 5’s session, the class aimed to define a taxonomy of multimodal reasoning: the (hierarchical) composition of unimodal and multimodal evidence into higher-level abstract concepts for prediction.
Notes here: cmu-multicomp-lab.github.io/adv-mmml-cours…