Work on memorization and variance of gradients (VoG) shows that hard examples are learnt later in training, and that learning rates impact what is learnt.
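For concreteness, here is a minimal sketch of how a per-example VoG score can be computed, assuming access to a handful of saved training checkpoints; the helper names below are illustrative, not the released VoG implementation:

```python
# Variance of gradients (VoG) sketch: for each example, take the gradient of
# the true-class logit w.r.t. the input pixels at several checkpoints, then
# measure how much that gradient varies across training.
import torch

def input_gradient(model, x, y):
    """Gradient of the true-class logit with respect to the input pixels."""
    model.eval()
    x = x.clone().requires_grad_(True)
    logits = model(x.unsqueeze(0))   # shape: (1, num_classes)
    logits[0, y].backward()
    return x.grad.detach()           # same shape as the input image

def vog_score(checkpoint_models, x, y):
    """Per-pixel variance of the gradient across checkpoints, averaged over pixels.

    Higher scores tend to flag harder examples that are learnt late in training.
    """
    grads = torch.stack([input_gradient(m, x, y) for m in checkpoint_models])
    return grads.var(dim=0).mean().item()
```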
One of the reasons the model matters is that notions of fairness often coincide with how underrepresented features are treated.
Treatment of the long tail appears to depend on many factors, including memorization (bit.ly/3qnru3v), capacity, and objective.
So, let's disabuse ourselves of the incorrect notion that the model is independent of considerations of algorithmic bias.
This simply isn't the case: our choices of model architecture, hyper-parameters, and objective function all inform considerations of algorithmic bias.
These were a few quick examples -- but there is plenty of important scholarship I have not included in this thread, including important work on the relationship between robustness and fairness. You are welcome to add additional work that considers these trade-offs.
At face value, deep neural network pruning appears to promise that you can (almost) have it all: remove the majority of weights with minimal degradation to top-1 accuracy. In this work, we explore this trade-off by asking whether certain classes are disproportionately impacted.
We find that pruning is better described as "selective brain damage" -- performance on a tiny subset of classes and images is cannibalized in order to preserve overall performance. The interesting part is what makes certain images more likely to be forgotten...
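As a rough illustration of the kind of analysis involved, the sketch below compares per-class accuracy of a dense model against a magnitude-pruned copy. It uses post-hoc L1 pruning from torch.nn.utils.prune for simplicity (the paper prunes during training), and `model`, `test_loader`, and `num_classes` are assumed to already exist:

```python
# Compare per-class accuracy before and after magnitude pruning to find the
# classes that absorb a disproportionate share of the damage.
import copy
import torch
import torch.nn.utils.prune as prune

def per_class_accuracy(model, loader, num_classes):
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            preds = model(x).argmax(dim=1)
            for c in range(num_classes):
                mask = y == c
                total[c] += mask.sum()
                correct[c] += (preds[mask] == c).sum()
    return correct / total.clamp(min=1)

# Prune 90% of weights in every conv/linear layer of a copy of the model.
pruned = copy.deepcopy(model)
for module in pruned.modules():
    if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.9)

acc_dense = per_class_accuracy(model, test_loader, num_classes)
acc_pruned = per_class_accuracy(pruned, test_loader, num_classes)

# Classes whose accuracy drops far more than the small top-1 change.
hardest_hit = torch.argsort(acc_dense - acc_pruned, descending=True)[:10]
```

Classes that keep surfacing at the top of `hardest_hit` across pruning levels are the ones whose performance is being cannibalized.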