The combination of data augmentation and transfer learning works well, and it is always worth trying.
I say "trying" because machine learning is an experimental science. No single technique or model is guaranteed to work on every problem.
This is the end!
To summarize:
When working with small data, data augmentation and transfer learning can boost the results.
Data augmentation expands the dataset, and transfer learning lets you reuse models that someone else trained on a large dataset.
One more thing:
Even if you have enough data to solve a given problem, data augmentation is still worth trying.
Remember: it introduces diversity into the training set, which usually helps reduce overfitting.
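To make the augmentation idea concrete, here is a minimal sketch in plain Python. It assumes images are small 2D lists of pixel values, and the helper names (`horizontal_flip`, `shift_right`, `augment`) are made up for illustration. Real projects usually rely on a library for this.

```python
import random

def horizontal_flip(image):
    """Mirror each row of a 2D image (a list of rows)."""
    return [row[::-1] for row in image]

def shift_right(image, pixels=1, fill=0):
    """Shift every row to the right, padding the left edge with `fill`."""
    width = len(image[0])
    pixels = min(pixels, width)
    return [[fill] * pixels + row[:width - pixels] for row in image]

def augment(images, labels, seed=0):
    """Return the original samples plus one randomly transformed copy of each."""
    rng = random.Random(seed)
    transforms = [horizontal_flip, shift_right]
    out_images, out_labels = list(images), list(labels)
    for image, label in zip(images, labels):
        transform = rng.choice(transforms)
        out_images.append(transform(image))
        out_labels.append(label)  # the transformation does not change the label
    return out_images, out_labels

# Hypothetical toy data: one 2x3 "image" labeled "cat".
image = [[1, 2, 3],
         [4, 5, 6]]
aug_images, aug_labels = augment([image], ["cat"])
print(len(aug_images), aug_labels)
```

The dataset doubles in size while every label stays the same, which is the whole point: more varied inputs for the same targets.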
Thanks for reading!
If you have found this thread helpful, share it with anyone who you think would like to learn more about those techniques.
I also appreciate retweets. It is certainly the best way to help the post reach many people.
Neural networks are hard to train. The deeper they get, the more likely they are to suffer from unstable gradients.
Gradients can either explode or vanish, and either of those can cause the network to give poor results.
A short thread on neural network training issues
The vanishing gradients problem makes the network take too long to train (learning becomes very slow), while exploding gradients make the weight updates very large, which makes training unstable.
Although those problems are nearly inevitable, the choice of activation function can reduce their effects.
Using ReLU activation in the first layers can help avoid vanishing gradients.
Careful weight initialization can also help, but ReLU is by far the better fix.
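A toy calculation shows why depth makes gradients unstable. This sketch assumes a chain of identical layers, each contributing the same weight and the same local derivative, which is a big simplification of a real network. The function names are made up for illustration.

```python
import math

def sigmoid_grad(z):
    """Derivative of the sigmoid; it peaks at 0.25 when z = 0."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def chained_gradient(local_grad, z, depth, weight=1.0):
    """Multiply `depth` identical (weight * local derivative) factors,
    mimicking how backpropagation chains derivatives layer by layer."""
    grad = 1.0
    for _ in range(depth):
        grad *= weight * local_grad(z)
    return grad

for depth in (1, 5, 10, 20):
    vanishing = chained_gradient(sigmoid_grad, 0.0, depth)          # sigmoid at its best case
    stable = chained_gradient(relu_grad, 1.0, depth)                # ReLU, positive input
    exploding = chained_gradient(relu_grad, 1.0, depth, weight=1.5) # weights above 1
    print(depth, vanishing, stable, exploding)
```

Even at its most favorable point, the sigmoid factor is 0.25 per layer, so the product shrinks quickly with depth. ReLU keeps the factor at 1 for positive inputs, and weights larger than 1 make the product grow instead, which is the exploding case.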
Machine learning is the science of teaching a computer to do a task without hardcoding the rules. Instead, we give it data that contains examples of what we want to achieve, and its job is to learn the patterns that map the provided data to the desired output.
These patterns or (learned) rules can be used to make predictions on unseen data.
A machine learning model is nothing other than a mathematical function whose coefficients and intercept hold the best (or learned) values for mapping the provided data to what we want to achieve.
In ML terms, coefficients are weights, intercepts are biases.
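Here is a minimal sketch of that idea: a one-input linear model whose weight and bias are learned with plain gradient descent. The data, learning rate, and function names are assumptions chosen for illustration. The targets follow y = 2x + 1, so the learned values should land close to 2 and 1.

```python
def predict(x, weight, bias):
    """The model is just a function: weight * x + bias."""
    return weight * x + bias

def fit(xs, ys, learning_rate=0.05, epochs=2000):
    """Learn weight and bias by gradient descent on the mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [predict(x, weight, bias) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

# Toy training data generated from y = 2x + 1 (the "rule" is never hardcoded).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

weight, bias = fit(xs, ys)
print(round(weight, 3), round(bias, 3))

# Use the learned rule on an input the model has not seen.
print(round(predict(10.0, weight, bias), 2))
```

Nothing in `fit` knows about the rule behind the data. The weight and bias end up holding it because those values minimize the error on the examples.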
Getting started with machine learning can be hard.
We are fortunate to have many freely available learning resources, but most of them won't help because they skip the fundamentals or start with moonshots.
This is a thread on learning machine learning & structured resources.
1. Get excited first
The first step to learning a hard topic is to get excited.
Machine learning is a demanding field and it will take time to start understanding concepts & connecting things.
If you find it hard to understand what ML really is,
@lmoroney's I/O 19 talk will get you excited. He introduces what machine learning really is from a programming perspective.