Q: If I implement a paper, there are likely lots of existing implementations of it already. How do I make it worthwhile?
⬇️
1> I think the learning aspect should precede any other aspect in this regard. Whether or not it's gonna be worthwhile shouldn't matter if you are up for the learning challenge.
2> But if you want to make the implementation a part of your project portfolio, the following things could be helpful.
2.1> You could pick up papers that are a bit off the grid from the conventional ones while still being in your territory. You should also enjoy working on it.
2.2> Another direction could be to come up with experiments that showcase some unique properties of the underlying technique/model. Maybe challenge some of the things present in the original implementation. This shows maturity and involvement.
Would love to know how other people go about this.
• • •
Implementing a paper is helpful in so many ways. You get to:
* Know the work inside out, including the implementation details.
* Study amazing resources to further your understanding.
* Read a lot of code for reference. Sometimes, the official codebases are amazing.
1/
Oftentimes, an idea seems fairly simple but when it comes to implementation details, things start to get messier. This is the learning, folks!
If the original impl. is messy, you might be able to make it more elegant, simpler, and in turn, better.
2/
For me, implementing existing works has helped me become a better practitioner and also a better believer. It's almost never easy but that's the real fun. It boosts your confidence and also your knowledge.
3/
We provide standalone scripts and also notebooks for training and testing our models. We open-source all the experimental results and pre-trained models:
Recipes that I find to be beneficial when working in low-data/imbalance regimes (vision):
* Use a weighted loss function and/or focal loss.
* Either use simpler/shallower models or use models that are known to work well in these cases. Ex: SimCLRv2, Big Transfer, DINO, etc.
1/n
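To make the focal loss bullet above concrete, here is a minimal, framework-free sketch of the binary case. The function name `binary_focal_loss` and the defaults for `gamma` and `alpha` are illustrative assumptions, not something taken from the thread. In a real project you would likely reach for a tensor-based implementation instead.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a non-empty batch (illustrative sketch).

    y_true: iterable of 0/1 labels.
    p_pred: iterable of predicted probabilities for class 1.
    alpha weights the positive class, (1 - alpha) the negative class.
    gamma > 0 down-weights examples the model already gets right.
    """
    total = 0.0
    count = 0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)        # clip to avoid log(0)
        p_t = p if y == 1 else 1.0 - p         # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
        count += 1
    return total / count


easy = binary_focal_loss([1], [0.9])   # confident and correct
hard = binary_focal_loss([1], [0.1])   # confident and wrong
```

With `gamma=0` the modulating factor disappears and this reduces to an alpha-weighted cross-entropy, so the same function covers the "weighted loss" half of the bullet too.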
* Use MixUp or CutMix in the augmentation pipeline to relax the space of marginals.
* Ensure a certain percentage of minority-class data is always present in each mini-batch. In @TensorFlow, this can be done using `tf.data.experimental.rejection_resample`.
* Use semi-supervised learning recipes that combine the benefits of self-supervision and few-shot learning. Ex: PAWS by @facebookai.
* SWA (Stochastic Weight Averaging) is generally advised for better generalization, and it is particularly useful in these regimes.
3/n
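As a rough illustration of the MixUp bullet above, this sketch blends two examples and their one-hot labels with a single coefficient drawn from a Beta distribution. It uses plain Python lists to stay self-contained. The helper name `mixup_pair` and the default `alpha=0.2` are assumptions made for the example.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels (illustrative sketch).

    lam is sampled from Beta(alpha, alpha); small alpha keeps most mixed
    samples close to one of the two originals.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Two toy "images" with two features each, belonging to different classes.
x_mix, y_mix, lam = mixup_pair(
    [0.0, 1.0], [1.0, 0.0],
    [1.0, 0.0], [0.0, 1.0],
    rng=random.Random(0),  # seeded generator for repeatability
)
```

The mixed label stays a valid distribution (its entries still sum to 1), which is what lets you train on it with a regular cross-entropy loss.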
New #Keras example is up on *consistency regularization*, an important recipe for semi-supervised learning and tackling distribution shifts as shown in *Noisy Student Training*.
This example provides a template for performing semi-supervised / weakly supervised learning. A few things one can plug right in:
* Incorporate more data while training the student.
* Filter the high-confidence predictions while training the student.
2/n
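The "filter the high-confidence predictions" idea above can be sketched as follows. This is not code from the Keras example. The function name `filter_confident` and the `0.9` threshold are assumptions picked for illustration.

```python
def filter_confident(probabilities, threshold=0.9):
    """Keep only teacher predictions whose top probability clears the threshold.

    probabilities: list of per-class probability lists from the teacher.
    Returns (example_index, pseudo_label) pairs for the retained examples.
    """
    kept = []
    for i, probs in enumerate(probabilities):
        label = max(range(len(probs)), key=probs.__getitem__)  # argmax
        if probs[label] >= threshold:
            kept.append((i, label))
    return kept


teacher_probs = [[0.95, 0.05], [0.60, 0.40], [0.02, 0.98]]
pseudo_labeled = filter_confident(teacher_probs, threshold=0.9)
```

The student would then be trained only on the retained indices, with the teacher's argmax as the pseudo-label. A lower threshold admits more data at the cost of noisier labels.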
The example uses Stochastic Weight Averaging while training the teacher to induce geometric ensembling. With elements like Stochastic Dropout, the performance might even be better.
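For intuition on the weight-averaging part, here is a bare-bones sketch of the running average at the heart of SWA. It works on flat lists of floats and is not how the Keras example implements it. The class name `WeightAverager` is hypothetical. In practice, which snapshots to average and the learning-rate schedule around them matter as well, and both are left out here.

```python
class WeightAverager:
    """Equal-weight running average of weight snapshots (illustrative sketch)."""

    def __init__(self):
        self.average = None
        self.num_snapshots = 0

    def update(self, weights):
        """Fold one snapshot (a flat list of floats) into the running average."""
        if self.average is None:
            self.average = list(weights)
        else:
            n = self.num_snapshots
            self.average = [
                (avg * n + w) / (n + 1) for avg, w in zip(self.average, weights)
            ]
        self.num_snapshots += 1
        return self.average


averager = WeightAverager()
for snapshot in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]):
    swa_weights = averager.update(snapshot)
```

After training you would load the averaged weights into the model. If the model has batch normalization layers, their statistics typically need to be recomputed for the averaged weights.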
If you use @TensorFlow moderately in your work, I think you already have the prerequisites. Definitely take the *TensorFlow in Practice* specialization by @lmoroney & @DeepLearningAI_. It will get you up to speed.
Study the contents rigorously.
Review the certificate handbook carefully. It really has all the information you need to know about the certification - tensorflow.org/extras/cert/TF….
* Install @pycharm & get sufficiently comfortable with it.
* Set up the exam environment properly.