Kristy Choi
CS @Stanford. Previously CS-Stats @Columbia. Machine Learning, Bayesian statistics, generative models.
Jul 12, 2020 7 tweets 3 min read
Excited to share our #ICML2020 paper on fair generative modeling! We present a scalable approach for mitigating dataset bias in models learned on various datasets without explicit annotations. 👇

w/ @adityagrover_ @_smileyball Trisha Singh @StefanoErmon
arxiv.org/abs/1910.12008

Generative models can be trained on large, unlabeled data sources.

If we naively mix all datasets, a trained model will propagate or amplify the bias in this mixture. On the other hand, labeling all attributes of interest may be impossible or super expensive. (2/7)