Skip connections are a common feature in modern CNN architectures. They create an alternative path for the gradient to flow through during backpropagation, which mitigates the vanishing-gradient problem in deep networks and helps the model learn faster.
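To make this concrete, here's a minimal sketch of a residual block in PyTorch (the block structure, channel count, and the `ResidualBlock` name are illustrative, not taken from any particular architecture). The input `x` is added back onto the block's output, so during backpropagation the gradient has a direct path around the convolutions:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: output = F(x) + x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        # The skip connection: add the input back before the final activation.
        # During backprop, gradients flow through this addition unchanged,
        # bypassing the two convolutions.
        return self.relu(out + x)

block = ResidualBlock(channels=16)
x = torch.randn(1, 16, 32, 32)
print(block(x).shape)  # torch.Size([1, 16, 32, 32])
```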
In a neural network, the gradient measures how much a small change in each weight affects the final error. During training, we use gradients to update the weights so the network recognizes patterns in the data better.
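As a tiny worked example (the single weight, input, and target below are made up), an autograd library like PyTorch can compute that gradient for us, and a gradient descent step then nudges the weight in the opposite direction:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # one weight
x = torch.tensor(3.0)                      # one input
y_true = torch.tensor(7.0)                 # target output

y_pred = w * x                  # the network's "output"
loss = (y_pred - y_true) ** 2   # squared prediction error
loss.backward()                 # compute d(loss)/d(w)

print(w.grad)  # tensor(-6.), since d/dw (wx - y)^2 = 2x(wx - y) = 2*3*(6-7)
with torch.no_grad():
    w -= 0.1 * w.grad  # gradient descent: move against the gradient
print(w)  # tensor(2.6000, ...), a step toward a lower error
```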
Here's the lowdown: the output of a neural network is calculated from the weights of the edges that connect the nodes in the network.
So, you gotta find the weight values that minimize the final error on the training examples.
1️⃣ We start by assigning random values to all weights in the network.
2️⃣ Then, for every input sample, we perform a feedforward operation to calculate the final output and the prediction error (both steps are sketched in code below).
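Here's a minimal sketch of those two steps in PyTorch (the layer sizes, input sample, and target value are arbitrary placeholders for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Step 1: build a small network; PyTorch assigns random initial weights.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# A single toy input sample and its target value.
x = torch.randn(1, 4)
y_true = torch.tensor([[1.0]])

# Step 2: feedforward pass to get the prediction, then the error.
y_pred = model(x)
error = (y_pred - y_true) ** 2  # squared prediction error
print(y_pred.item(), error.item())
```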