What should ML models do when there's a *perfect* correlation between spurious features and labels?
This is hard b/c the problem is fundamentally _underspecified_.
DivDis can solve this problem by learning multiple diverse solutions & then disambiguating arxiv.org/abs/2202.03418
🧵
Prior works have made progress on robustness to spurious features but also have important weaknesses:
- They can't handle perfect/complete correlations
- They often need labeled data from the target distr. for hparam tuning
DivDis can address both challenges, using 2 stages:
1. The Diversify stage learns multiple functions that minimize training error but make differing predictions on unlabeled target data
2. The Disambiguate stage uses a few active queries to identify the correct function
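A minimal sketch of the Diversify objective, assuming a multi-head model: each head pays a supervised loss on source data, plus a penalty for *agreeing* with other heads on unlabeled target data. (The paper uses a mutual-information term; the pairwise inner-product penalty below is a hypothetical simplification, and `divdis_loss` is a made-up name.)

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def divdis_loss(head_logits_train, labels, head_logits_target, lam=1.0):
    """Sketch of a DivDis-style Diversify objective (simplified).

    head_logits_*: (num_heads, batch, num_classes)
    - supervised term: mean cross-entropy of each head on labeled source data
    - diversity term: penalize agreement between heads' predictive
      distributions on unlabeled target data
    """
    num_heads = head_logits_train.shape[0]
    probs = softmax(head_logits_train)
    # cross-entropy on the source labels, averaged over heads and examples
    ce = -np.log(probs[:, np.arange(len(labels)), labels] + 1e-12).mean()
    # agreement penalty: expected inner product of head predictions on target data
    tgt = softmax(head_logits_target)
    agree, pairs = 0.0, 0
    for i in range(num_heads):
        for j in range(i + 1, num_heads):
            agree += (tgt[i] * tgt[j]).sum(axis=-1).mean()
            pairs += 1
    return ce + lam * agree / max(pairs, 1)
```

Minimizing this pushes heads to fit the training labels while splitting on the target distribution, so different heads latch onto different candidate features (core vs. spurious).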
I'm super excited about DivDis for a few reasons.
First, it can start to address underspecified problems with perfect spurious correlations, with mild assumptions.
It can also combat simplicity bias when the spurious feature is much simpler than the core feature
Second, it yields good performance even when hyperparameters are tuned on held-out data from the training distribution
Third, it conceptually addresses a problem that Bayesian NNs & ensembles struggle with.
By leveraging unlabeled data from the target distribution (the transductive setting), it can cover the space of relevant solutions much more effectively.
Finally, this was also a problem I was puzzled by a year ago, and it's awesome to have an initial solution to the puzzle. :)
2/ Student feedback is a fundamental problem in scaling education.
Providing good feedback is hard: existing approaches give canned responses, cryptic error messages, or simply reveal the answer.
3/ Providing feedback is also hard for ML: not a ton of data, teachers frequently change their assignments, and student solutions are open-ended and long-tailed.
Supervised learning doesn’t work. We weren’t sure if this problem could even be solved with ML.
To get reward functions that generalize, we train domain-agnostic video discriminators (DVD) with:
* a lot of diverse human data, and
* a small & narrow set of robot demos
The idea is super simple: predict if two videos are performing the same task or not.
(2/5)
This discriminator can be used as a reward by feeding in a human video of the desired task and a video of the robot’s behavior.
We use it by planning with a learned visual dynamics model.
(3/5)
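The reward idea above can be sketched in a few lines, assuming a learned video encoder and dynamics model (here `embed` and `rollout` are hypothetical stand-ins, and the discriminator's "same task?" score is approximated by embedding similarity rather than the trained DVD network):

```python
import numpy as np

def cosine(a, b):
    # similarity between two video embeddings
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def dvd_reward(embed, human_video, robot_video):
    """Use a same-task discriminator as a reward: score how likely the
    robot's video shows the same task as the human demo video."""
    return cosine(embed(human_video), embed(robot_video))

def plan(candidates, rollout, embed, human_video):
    """Score each candidate action sequence by rolling it out with a
    learned visual dynamics model and pick the highest-reward one."""
    scored = [(dvd_reward(embed, human_video, rollout(c)), c) for c in candidates]
    return max(scored, key=lambda t: t[0])[1]
```

The key point is that the reward never needs task-specific labels at planning time; it only needs one human video of the desired task.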
To think about this question, we first look at how equivariances are represented in neural nets.
They can be seen as certain weight-sharing & weight-sparsity patterns. For example, consider convolutions.
(2/8)
We reparametrize a weight matrix into a sharing matrix & underlying filter parameters.
It turns out this can provably represent any equivariant structure + filter parameters, for all group-equivariant convolutions with finite groups.
(3/8)