Leshem Choshen

@LChoshen

During training, your loss goes up and down, up and down, up and down.

But how would it go if you magically went in a straight line
from init to learnt position?

Apparently smoothly down!

On the surprising Linear Interpolation:
#scientivism #deepRead #MachineLearning

It all started at ICLR 2015(!)
@goodfellow_ian @OriolVinyalsML @SaxeLab checked points on the straight line between the random initialization and the converged model.
They found that the loss along that line decreases monotonically as you move from the init to the converged model.
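
To make the experiment concrete, here is a minimal sketch of what "checking points between init and converged" can look like. It is not the code from the ICLR paper: the helper name `loss_along_line`, the toy linear-regression "model", the step size, and the number of points are all assumptions chosen for illustration. The toy problem is convex, so a monotone curve is expected here; the surprise in the thread is that deep networks often look similar.

```python
import numpy as np

def loss_along_line(theta_init, theta_final, loss_fn, num_points=11):
    """Evaluate loss_fn at evenly spaced points on the segment init -> final.

    Hypothetical helper: alpha=0 is the initialization, alpha=1 the trained model.
    """
    alphas = np.linspace(0.0, 1.0, num_points)
    losses = np.array(
        [loss_fn((1.0 - a) * theta_init + a * theta_final) for a in alphas]
    )
    return alphas, losses

# Toy stand-in for a network (assumption): linear regression on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

def mse(theta):
    """Mean squared error of the linear model with parameters theta."""
    return float(np.mean((X @ theta - y) ** 2))

# "Training": plain full-batch gradient descent from a random init.
theta_init = rng.normal(size=5)
theta = theta_init.copy()
for _ in range(500):
    grad = 2.0 * X.T @ (X @ theta - y) / len(y)
    theta -= 0.05 * grad

# Walk the straight line from init to the trained parameters.
alphas, losses = loss_along_line(theta_init, theta, mse)

# True if the loss never increases along the line (tiny tolerance for float noise).
is_monotone = bool(np.all(np.diff(losses) <= 1e-12))
for a, l in zip(alphas, losses):
    print(f"alpha={a:.1f}  loss={l:.4f}")
print("monotonically decreasing:", is_monotone)
```

For a real network you would flatten all weights into one vector (or interpolate each tensor separately) and evaluate the training loss at every alpha; a bump would show up as a positive entry in `np.diff(losses)`.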

Why shouldn't it?
Well... The real question is why should it.

If the loss terrain is anything but a simple slope, we would expect bumps along the way. Maybe there are different sinks (local minima), or maybe you need to pass through a worse model before you reach the best model (topologically, you are in a ditch).
