Excited to share our new #neurips2020 paper /Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel/ (arxiv.org/abs/2010.15110) with @KDziugaite, Mansheej, @SKharaghani, @roydanroy, @SuryaGanguli 1/6
We Taylor-expand Deep Neural Network logits with respect to their weights at different stages of training & study how well a linearized network trains based on the epoch at which it was expanded. Early expansions train poorly, but even slightly into training they do very well! 2/6
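
For intuition, here is a minimal JAX sketch of the linearization step (not the paper's code; the two-layer network f and its parameter names below are made-up stand-ins). The linearized model is the first-order Taylor expansion f_lin(θ, x) = f(θ₀, x) + ∇_θ f(θ₀, x)·(θ − θ₀), which jax.jvp evaluates without ever forming the full Jacobian:

import jax
import jax.numpy as jnp

def f(params, x):
    # stand-in two-layer MLP logits; the networks studied in the paper are larger
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]

def linearize_at(params0):
    # First-order Taylor expansion of the logits around params0:
    # f_lin(params, x) = f(params0, x) + J_f(params0, x) @ (params - params0)
    def f_lin(params, x):
        dparams = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        y0, jvp_out = jax.jvp(lambda p: f(p, x), (params0,), (dparams,))
        return y0 + jvp_out
    return f_lin

Training f_lin with the usual loss and optimizer, starting from params0, is what "training the linearized network expanded at that epoch" means here.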
Linearized DNNs underperform even nonlinear networks trained with low learning rates, but only for expansions /very early/ in training. We call this the *nonlinear advantage* and show that it disappears quickly into training. 3/6
Surprisingly, the nonlinear advantage a DNN enjoys over its linearized counterpart seems to correlate well with the error barrier (= the instability measure of arxiv.org/abs/1912.05671 by @jefrankle @mcarbin) between 2 NNs trained from that point, connecting two very different concepts. 4/6
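
A rough sketch of how such an error barrier can be measured, following the linear-interpolation idea of the instability paper (error_fn and the handling of checkpoints are assumptions here, not taken from either paper):

import jax
import jax.numpy as jnp

def interpolation_barrier(params_a, params_b, error_fn, num_points=11):
    # Error barrier between two trained networks: max over alpha of
    # err((1-alpha)*a + alpha*b) minus the straight-line interpolation
    # of the endpoint errors. error_fn(params) -> scalar test error.
    alphas = jnp.linspace(0.0, 1.0, num_points)
    errs = []
    for a in alphas:
        mid = jax.tree_util.tree_map(lambda pa, pb: (1 - a) * pa + a * pb,
                                     params_a, params_b)
        errs.append(error_fn(mid))
    errs = jnp.stack(errs)
    baseline = (1 - alphas) * errs[0] + alphas * errs[-1]
    return jnp.max(errs - baseline)

The two endpoints would be copies trained from the same checkpoint but with different SGD noise (e.g. different data orders), and the barrier is the bump in error along the straight line between their weights.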
We find that many other DNN measures such as function space distance (similar to arxiv.org/abs/1912.02757 @balajiln), kernel distance, logit gradient similarity (similar to arxiv.org/abs/1910.05929) and others correlate in a similar manner. 5/6
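
As one example of these measures, kernel distance can be read as a cosine-style distance between empirical tangent kernels computed at two checkpoints. A hedged sketch (summing logits over classes is a simplification I'm assuming, not necessarily the paper's exact definition):

import jax
import jax.numpy as jnp

def empirical_ntk(f, params, x):
    # Empirical tangent kernel on a batch: K[i, j] = <df(x_i)/dtheta, df(x_j)/dtheta>,
    # with the logits summed over classes to get one scalar output per example.
    def scalar_out(p, xi):
        return jnp.sum(f(p, xi[None]))
    grads = jax.vmap(lambda xi: jax.grad(scalar_out)(params, xi))(x)
    flat = jax.vmap(lambda g: jnp.concatenate(
        [jnp.ravel(leaf) for leaf in jax.tree_util.tree_leaves(g)]))(grads)
    return flat @ flat.T

def kernel_distance(k1, k2):
    # 1 - <K1, K2>_F / (||K1||_F * ||K2||_F): 0 for identical kernels (up to scale)
    inner = jnp.sum(k1 * k2)
    return 1.0 - inner / (jnp.linalg.norm(k1) * jnp.linalg.norm(k2))

Tracking kernel_distance(empirical_ntk at epoch t, empirical_ntk at the end of training) as a function of t is one way to see how quickly the tangent kernel stops evolving.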
It seems that the nonlinear nature of deep neural nets is crucial at the beginning of training, but its importance diminishes relatively soon after that, after which Taylor-expanded DNNs (even 1st order, which in the ∞-width regime is the Neural Tangent Kernel) perform almost equally well. 6/6
This was joint work with an amazing team: @KDziugaite, @mansiege, @SKharaghani, @roydanroy, and @SuryaGanguli
