Precision: What is the percentage of positive predictions that are actually positive?
Recall: What is the percentage of actual positives that were predicted correctly?
The fewer false positives, the higher the precision, and vice versa.
The fewer false negatives, the higher the recall, and vice versa.
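In formula form: precision = TP / (TP + FP) and recall = TP / (TP + FN), where TP, FP, and FN are the counts of true positives, false positives, and false negatives. A minimal Python sketch, using made-up counts:

    # Hypothetical counts, not from a real model
    tp, fp, fn = 80, 10, 20

    precision = tp / (tp + fp)  # fewer false positives -> higher precision
    recall = tp / (tp + fn)     # fewer false negatives -> higher recall

    print(f"Precision: {precision:.2f}")  # 0.89
    print(f"Recall: {recall:.2f}")        # 0.80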
How do you increase precision? Reduce false positives.
It depends on the problem, but generally it might mean fixing the labels of the negative samples that are being predicted as positives, or adding more such samples to the training data.
How do you increase recall? Reduce false negatives.
Fix the labels of positive samples that are wrongly being classified as negatives, or add more positive samples to the training data.
What happens when I increase precision? I will usually hurt recall.
There is a tradeoff between them: increasing one tends to reduce the other.
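One common way to see the tradeoff is to move the decision threshold of a probabilistic classifier. A sketch assuming that model is any fitted scikit-learn classifier and that X_test and y_test are held-out data (all three are assumptions, not defined here):

    from sklearn.metrics import precision_score, recall_score

    # Probability of the positive class; assumes model is already fitted
    probs = model.predict_proba(X_test)[:, 1]

    # Raising the threshold usually raises precision and lowers recall
    for threshold in (0.3, 0.5, 0.7):
        preds = (probs >= threshold).astype(int)
        p = precision_score(y_test, preds)
        r = recall_score(y_test, preds)
        print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")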
What does it mean when the precision of your classifier is 1?
False positives are 0.
Your classifier is smart about not classifying negative samples as positives.
What about recall being 1?
False negatives are 0.
Your classifier is smart about not classifying positive samples as negatives.
What if precision and recall are both 1? You have a perfect classifier, at least on the data you evaluated it on. This is ideal!
What is a better way to assess the performance of the classifier without fighting a balancing battle between precision and recall?
Combine them. Find their harmonic mean. If either precision or recall is low, the resulting mean will be low too.
This harmonic mean is called the F1 score, and it is a reliable metric to use when we are dealing with imbalanced datasets.
If your dataset is balanced (the positive and negative samples in the training set are roughly equal), ordinary accuracy is enough.
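In formula form: F1 = 2 * (precision * recall) / (precision + recall). A minimal sketch with hypothetical labels, using scikit-learn:

    from sklearn.metrics import f1_score

    # Hypothetical ground-truth labels and predictions
    y_true = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

    # Harmonic mean of precision and recall:
    # F1 = 2 * (precision * recall) / (precision + recall)
    print(f"F1 score: {f1_score(y_true, y_pred):.2f}")  # 0.67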
Typical first steps when exploring a new dataset (a pandas sketch follows the list):
◆ A quick look into the dataset
◆ Summary statistics
◆ Finding the basic information about the dataset
◆ Checking missing data
◆ Checking feature correlations
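A minimal pandas sketch of these steps; data.csv is a placeholder path, not a real file:

    import pandas as pd

    df = pd.read_csv("data.csv")  # hypothetical dataset

    print(df.head())          # a quick look at the first rows
    print(df.describe())      # summary statistics for numeric columns
    df.info()                 # basic information: dtypes, non-null counts
    print(df.isnull().sum())  # missing values per column
    print(df.corr(numeric_only=True))  # pairwise feature correlations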
The key differences between shallow learning and deep learning models:
Shallow learning models:
◆ Most of them are simple and require less hyperparameter tuning
◆ They need the features to be extracted (engineered) beforehand
◆ They are best suited for tabular datasets
◆ Their architectural changes are very limited
◆ They don't require huge computational resources
◆ Their results are more interpretable than those of deep learning models
◆ Because their design can change only so much, there is little ongoing research on these models
Examples of shallow learning models (a quick scikit-learn sketch follows the list):
◆ Linear and logistic regression
◆ Support vector machines
◆ Decision trees
◆ Random forests
◆ K-Nearest neighbors
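A sketch training one of the models above (a random forest) on a small built-in tabular dataset; the dataset choice and hyperparameters are illustrative, not prescriptive:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # No feature extraction needed here: the data is already tabular
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # classification_report shows precision, recall, and F1 per class
    print(classification_report(y_test, model.predict(X_test)))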