The machine learning research community is very, very vibrant.
Here is what I mean...🧵🧵
In 1958, Frank Rosenblatt invented the perceptron, a very simple algorithm that would later turn out to be the core and origin of today's intelligent machines.
In essence, the perceptron is a simple binary classifier that can determine whether or not a given input belongs to a specific class.
Here is the perceptron algorithm:
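In code, a minimal NumPy sketch of the learning rule might look like this (my own illustration, not Rosenblatt's original formulation; the function name, learning rate, and toy data are assumptions):

import numpy as np

def train_perceptron(X, y, epochs=10, lr=1.0):
    # Classic perceptron rule: predict with a step function,
    # then nudge the weights only when a sample is misclassified.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):           # y holds 0/1 class labels
            pred = 1 if np.dot(w, xi) + b > 0 else 0
            update = lr * (target - pred)      # 0 when the prediction is correct
            w += update * xi
            b += update
    return w, b

# Toy usage: learn the AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)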
Rosenblatt's intent was not to develop the perceptron as an algorithm or software, but rather as a machine.
The perceptron was implemented in hardware known as the Mark I Perceptron.
The weights were encoded in potentiometers, and weight updates were done by electric motors.
Here is the Mark I Perceptron.
The perceptron, a simple artificial neural network architecture, led to multilayer perceptrons (MLPs) and later inspired other types of neural networks such as convolutional neural networks and recurrent neural networks.
Let's shift a little bit to convolutional neural networks.
Convolutional neural networks are the type of neural nets that have succeeded in image recognition tasks.
They are also used in text classification and time series analysis, but their major applications lie in visual recognition tasks.
One of the first ConvNet architectures is LeNet-5, which was introduced by @ylecun.
LeNet-5 was designed for handwritten and machine-printed character recognition.
Here is a 1998 paper that used LeNet-5 for text recognition. The first paper that used LeNet-5 appeared in 1989.
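To give a feel for how compact that architecture is by today's standards, here is a rough LeNet-5-style model in Keras; the layer sizes follow the 1998 paper, but the activations and pooling choices are simplified assumptions rather than an exact reproduction:

from tensorflow import keras
from tensorflow.keras import layers

# Roughly LeNet-5: two conv/pool stages followed by three dense layers.
lenet5 = keras.Sequential([
    keras.Input(shape=(32, 32, 1)),                  # 32x32 grayscale characters
    layers.Conv2D(6, kernel_size=5, activation="tanh"),
    layers.AveragePooling2D(pool_size=2),
    layers.Conv2D(16, kernel_size=5, activation="tanh"),
    layers.AveragePooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(120, activation="tanh"),
    layers.Dense(84, activation="tanh"),
    layers.Dense(10, activation="softmax"),          # 10 character classes
])
lenet5.summary()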
Other ConvNet architectures that followed include GoogLeNet by @ChrSzegedy (which won ImageNet 2014 and inspired Inception v3 and v4), VGG or Visual Geometry Group (the ImageNet 2014 runner-up), and ResNet, which won the 2015 challenge.
Other ConvNet architectures followed as well, but most of them were modifications of previous architectures.
For example, Xception or Extreme Inception by @fchollet is a version of GoogLeNet that uses depthwise separable convolution layers instead of inception modules.
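To make the separable-convolution idea concrete, here is a small, hypothetical Keras comparison; the input shape and filter count are arbitrary assumptions, not Xception's actual configuration:

from tensorflow import keras
from tensorflow.keras import layers

def count_params(layer):
    # Wrap the layer in a tiny model so Keras builds its weights.
    model = keras.Sequential([keras.Input(shape=(32, 32, 64)), layer])
    return model.count_params()

# A regular convolution mixes spatial and channel information in one step,
# while a depthwise separable convolution splits it into a per-channel
# spatial filter followed by a 1x1 pointwise mix, using far fewer parameters.
regular = layers.Conv2D(128, kernel_size=3, padding="same")
separable = layers.SeparableConv2D(128, kernel_size=3, padding="same")

print("regular conv parameters:  ", count_params(regular))    # roughly 74k here
print("separable conv parameters:", count_params(separable))  # roughly 9k here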
Now turning to recurrent networks.
Recurrent neural networks were created to handle sequential data such as texts, audio, and time series.
The first simple RNN cells failed to handle long sequences due to the vanishing gradient problem, and Long Short-Term Memory (LSTM) networks were introduced to overcome that.
Gated Recurrent Units (GRUs), which are a simpler version of LSTMs, were later introduced too.
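In Keras terms, the three recurrent cells mentioned above are essentially drop-in replacements for one another. A minimal sketch, with arbitrary sequence length and layer sizes of my own choosing:

from tensorflow import keras
from tensorflow.keras import layers

def sequence_model(rnn_layer):
    # Same wrapper each time; only the recurrent cell changes.
    return keras.Sequential([
        keras.Input(shape=(100, 8)),    # 100 timesteps, 8 features per step
        rnn_layer,
        layers.Dense(1, activation="sigmoid"),
    ])

simple_rnn = sequence_model(layers.SimpleRNN(32))  # struggles with long sequences
lstm = sequence_model(layers.LSTM(32))             # gates mitigate vanishing gradients
gru = sequence_model(layers.GRU(32))               # lighter-weight gated variant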
Precision: What is the percentage of positive predictions that are actually positive?
Recall: What is the percentage of actual positives that were predicted correctly?
The fewer false positives, the higher the precision, and vice versa.
The fewer false negatives, the higher the recall, and vice versa.
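Both metrics fall straight out of the confusion-matrix counts. A tiny NumPy illustration with made-up labels and predictions:

import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # actual labels (made-up)
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])   # model predictions (made-up)

tp = np.sum((y_pred == 1) & (y_true == 1))    # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))    # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))    # false negatives

precision = tp / (tp + fp)   # fraction of positive predictions that are right
recall = tp / (tp + fn)      # fraction of actual positives that were caught
print(precision, recall)     # 0.75 and 0.75 for this toy example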
How do you increase precision? Reduce false positives.
It can depend on the problem, but generally, that might mean fixing the labels of those negative samples (the ones being predicted as positive) or adding more of them to the training data.
It starts with a high-level overview of the model/technique being covered and then continues with the implementation.
And wherever possible, there are visuals to support the concepts.
Here is an outline of what you will find there:
PART 1 - Intro to Programming and Working with Data
◆ Intro to Python for Machine Learning
◆ Data Computation With NumPy
◆ Data Manipulation with Pandas
◆ Data Visualization
◆ Real EDA and Data Preparation (a pandas sketch of these steps follows this outline)
◆ A quick look into the dataset
◆ Summary statistics
◆ Finding the basic information about the dataset
◆ Checking missing data
◆ Checking feature correlations
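Here is a quick pandas sketch of those EDA steps; the file name and the dataset are placeholders, not something from the outline itself:

import pandas as pd

df = pd.read_csv("data.csv")        # placeholder path for any tabular dataset

df.head()                           # a quick look into the dataset
df.describe()                       # summary statistics
df.info()                           # basic information: dtypes, non-null counts
df.isnull().sum()                   # checking missing data per column
df.corr(numeric_only=True)          # checking feature correlations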
The key differences between shallow learning and deep learning models:
Shallow learning models:
◆ Most of them are simple and require less hyperparameter tuning
◆ They need the features to be pre-extracted
◆ They are best suited for tabular datasets
◆ Their architectural changes are very limited
◆ They don't require huge computation resources
◆ Their results are more interpretable than those of deep learning models
◆ Because their design leaves little room for change, there is little research going on in these models
Examples of shallow learning models:
◆ Linear and logistic regression
◆ Support vector machines
◆ Decision trees
◆ Random forests
◆ K-Nearest neighbors
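As a minimal sketch of that shallow-learning workflow with scikit-learn (the dataset, split, and model choices are just an illustration):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# A small tabular dataset: features are already extracted, so shallow models fit it directly.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LogisticRegression(max_iter=5000), RandomForestClassifier()):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))  # test accuracy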