How to add new classes to your ML model?
You have a large multi-class NN in production.
You discover a new important class and want to add support for it *quickly* and with *low* risk.
Example: traffic sign recognition for self-driving cars
Thread
The naive approach
Collect examples of the new class (for example a new traffic sign), label them and retrain the whole NN.
✅ It will probably work
❌ It will be time-consuming, especially for big models
❌ Risk of unintended regressions
Freezing the first layers
Typical CNNs learn generic image features in the initial layers and they will likely apply to the new sign as well.
You can freeze the weights of the initial layers and only retrain the last fully connected layer(s).
✅ Faster retraining, because only the last layers are updated
✅ Lower risk of regressions
❌ For signs that look very different, the CNN may lack the required features, leading to poor performance
❌ Still some risk of regression, because the retrained layers affect all classes
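To make the freezing idea concrete, here is a minimal pure-Python sketch, not a production recipe. It assumes a tiny hypothetical "pretrained" first layer (`FROZEN_W`) that is never updated, and retrains only a softmax output layer with plain SGD after a third class is added. The weights, the toy 2-D "images" and all function names are made up for illustration; a real CNN would do the same thing through its framework's mechanism for excluding parameters from training.

```python
import math

# Hypothetical "pretrained" first layer. It is a tuple, so it cannot be modified:
# this plays the role of the frozen initial layers.
FROZEN_W = ((1.0, -1.0), (-1.0, 1.0), (0.5, 0.5))

def features(x):
    """Frozen layer: ReLU(W x). Never updated during retraining."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def scores(f, head, bias):
    """Last fully connected layer: one weight row and one bias per class."""
    return [sum(w * fi for w, fi in zip(row, f)) + b for row, b in zip(head, bias)]

def train_head(data, num_classes, epochs=300, lr=0.5):
    """Train only the last layer with SGD on the cross-entropy loss."""
    head = [[0.0] * len(FROZEN_W) for _ in range(num_classes)]
    bias = [0.0] * num_classes
    for _ in range(epochs):
        for x, label in data:
            f = features(x)  # frozen part, forward pass only
            p = softmax(scores(f, head, bias))
            for c in range(num_classes):
                grad = p[c] - (1.0 if c == label else 0.0)
                bias[c] -= lr * grad
                for j, fj in enumerate(f):
                    head[c][j] -= lr * grad * fj
    return head, bias

def predict(x, head, bias):
    s = scores(features(x), head, bias)
    return s.index(max(s))

# Toy data: classes 0 and 1 are the known signs, class 2 is the new one.
DATA = [((1.0, 0.0), 0), ((0.9, 0.1), 0),
        ((0.0, 1.0), 1), ((0.1, 0.9), 1),
        ((1.0, 1.0), 2), ((0.9, 1.0), 2)]

head, bias = train_head(DATA, num_classes=3)
print([predict(x, head, bias) for x, _ in DATA])
```

Note that the output layer is retrained for all three classes at once, which is exactly why some regression risk remains for the old classes.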
Extracting high-level features
Train the NN to extract high-level features suitable for traffic signs, like shape, color, text or digits.
Define rules to classify each sign based on these features.
✅ No need to retrain
✅ Low risk because classes can be separated well
✅ Will work well for speed limits
❌ Won't work for some sign categories, for example warning signs that differ only by their pictogram
❌ Won't work for signs with new shapes
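A rough sketch of what such rules could look like, assuming the NN outputs a shape, a dominant color and any recognized digits. The `SignFeatures` fields, the label strings and the specific rules (red circle with digits for a speed limit) are hypothetical choices for illustration, not taken from a real system.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SignFeatures:
    """High-level features assumed to come from the NN (hypothetical schema)."""
    shape: str                    # e.g. "circle", "triangle", "octagon"
    color: str                    # dominant border/background color
    digits: Optional[str] = None  # recognized digits, if any

def classify(f: SignFeatures) -> str:
    """Hand-written rules on top of the extracted features."""
    if f.shape == "circle" and f.color == "red" and f.digits:
        # Any number works here, so a new speed limit needs no retraining.
        return f"speed_limit_{f.digits}"
    if f.shape == "octagon" and f.color == "red":
        return "stop"
    if f.shape == "triangle" and f.color == "red":
        # The pictogram is not among the features, so warning signs
        # cannot be told apart with these rules.
        return "warning_unknown_pictogram"
    return "unknown"

print(classify(SignFeatures("circle", "red", "50")))
print(classify(SignFeatures("circle", "red", "130")))
print(classify(SignFeatures("triangle", "red")))
```

The second call shows the upside (a previously unseen speed limit is handled by the same rule), the third one the limitation for warning signs.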
One-shot learning
Transform sign images into a latent space, where they can be compared. Each class image is then described by a feature vector.
Siamese NNs can be trained to transform images of the same class into very similar feature vectors in a latent space.
How does it work?
1. Choose examples for the known classes and precompute their feature vectors
2. Transform new images to the latent space and find the best match to a known class
3. When a new class is found, simply add an example image to the set of known classes
✅ No retraining needed
✅ Low risk of regression
✅ Only a few examples of the new class needed
❌ May not work well if the new sign is very different and not supported well by the encoder
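The three steps above can be sketched as a nearest-neighbor lookup over precomputed feature vectors. This is only an illustration: the trained encoder is assumed to exist and is replaced by hand-written 3-D vectors, and the `SignGallery` class, the cosine similarity metric and the 0.8 threshold are assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class SignGallery:
    """Known classes, each described by one precomputed feature vector."""

    def __init__(self, threshold=0.8):
        self.examples = {}          # class name -> feature vector
        self.threshold = threshold  # minimum similarity to accept a match

    def add_class(self, name, embedding):
        # Steps 1 and 3: adding a class is just storing one example vector.
        self.examples[name] = embedding

    def classify(self, embedding):
        # Step 2: find the best match among the known classes.
        best_name, best_sim = None, -1.0
        for name, reference in self.examples.items():
            sim = cosine_similarity(embedding, reference)
            if sim > best_sim:
                best_name, best_sim = name, sim
        return best_name if best_sim >= self.threshold else None

gallery = SignGallery()
gallery.add_class("stop", (1.0, 0.0, 0.0))
gallery.add_class("yield", (0.0, 1.0, 0.0))

unseen = (0.1, 0.1, 0.95)                   # embedding of a sign of a new class
print(gallery.classify((0.9, 0.1, 0.0)))    # close to the "stop" example
print(gallery.classify(unseen))             # no known class is similar enough
gallery.add_class("new_sign", (0.0, 0.0, 1.0))
print(gallery.classify(unseen))             # matched after adding one example
```

No weights change when `add_class` is called, which is where the low regression risk comes from.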
And finally, some examples of traffic signs that you may not think exist until you encounter them...
What are typical challenges when training deep neural networks?
- Overfitting
- Underfitting
- Lack of training data
- Vanishing gradients
- Exploding gradients
- Dead ReLUs
- Network architecture design
- Hyperparameter tuning
How to solve them?
Overfitting
Your model performs well during training, but poorly during test.
Possible solutions:
- Reduce the size of your model
- Add more data
- Increase dropout
- Stop the training early
- Add regularization to your loss
- Decrease batch size
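As an example of one item from this list, stopping the training early can be sketched as a patience rule on the validation loss. The function name, the `patience` and `min_delta` parameters and the loss values are assumptions for illustration; deep learning frameworks typically ship their own version of this logic.

```python
def early_stopping(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    # Never triggered: training ran for all epochs.
    return len(val_losses) - 1, best_epoch

# Validation loss starts rising after epoch 2, a typical sign of overfitting.
losses = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.1]
stop_epoch, best_epoch = early_stopping(losses, patience=2)
print(stop_epoch, best_epoch)
```

In practice you would keep the model checkpoint from `best_epoch` rather than the one from the epoch where training stopped.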
Underfitting
Your model performs poorly both during training and test.
Possible solutions:
- Increase the size of your model
- Add more data
- Train for a longer time
- Start with a pre-trained network
Self-driving car engineer roles - Big Data Engineer
Self-driving cars have lots of cameras, lidars and radars. Waymo currently has 29 cameras on a single vehicle! The cars generate huge amounts of data, easily more than 1 GB/s. This data needs to be processed...
Thread
Problems to work on
The big data engineer needs to design and implement efficient storage and data processing pipelines to handle such large amounts of data.
The data also needs to be made available to the developers in a way that they can efficiently get to what they need.
Data
Imagine that the self-driving car is recording data at a rate of 1 GB/s. Going on a test drive for 4 hours means that you'll collect more than 14 TB of data!
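The arithmetic behind that number, as a quick sketch. The helper name is made up, and decimal units (1 TB = 1000 GB) are assumed.

```python
def recorded_terabytes(rate_gb_per_s, hours):
    """Data volume in TB for a constant recording rate (1 TB = 1000 GB assumed)."""
    seconds = hours * 3600
    return rate_gb_per_s * seconds / 1000

# 1 GB/s for 4 hours: 4 * 3600 s = 14,400 s, so 14,400 GB
print(recorded_terabytes(1, 4))
```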
There are specialized loggers that can handle such rates, like this beast for example: vigem.de/en/content/pro…