David Page
@dcpage3
Machine learning researcher @nanopore
Sep 11, 2019 • 25 tweets • 8 min read
The paper that introduced Batch Norm (arxiv.org/abs/1502.03167) combines clear intuition with compelling experiments (14x speedup on ImageNet!!)
So why has 'internal covariate shift' remained controversial to this day?
Thread 👇
A recent question on Twitter from @yoavgo shows that there's still confusion about how Batch Norm works in practice:
https://twitter.com/yoavgo/status/1169495585084321792
Jun 20, 2019 • 4 tweets • 3 min read
New blog post: How does batch norm _really_ help optimisation?
We go on a tour of bad inits, degenerate networks and spiky Hessians - all in a Colab notebook:
colab.research.google.com/github/davidcp…
Summary 👇
1/
Early signs of trouble.
We learn that deep ReLU nets with He-init, but no batch norm, basically ignore their inputs! (Check out arxiv.org/abs/1902.04942 by Luther, @SebastianSeung for background.)
Easy to miss if you pool across channels:
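A rough way to see this effect for yourself, as a minimal sketch rather than the notebook's actual code: the snippet below assumes a plain fully connected ReLU net (no convolutions, no biases), random Gaussian inputs, and made-up helper names (`relu_mlp_activations`, `mean_pairwise_cosine`, `input_dependent_fraction`). It pushes unrelated inputs through a deep He-initialised stack and checks how similar the outputs become, then compares per-channel variance across inputs with the spread of the per-channel means.

```python
import numpy as np


def relu_mlp_activations(x, depth, width, rng):
    """Forward pass through `depth` He-initialised ReLU layers (no bias, no batch norm)."""
    h = x
    for _ in range(depth):
        fan_in = h.shape[1]
        # He init: weight variance 2 / fan_in keeps the mean squared activation roughly constant.
        w = rng.standard_normal((fan_in, width)) * np.sqrt(2.0 / fan_in)
        h = np.maximum(h @ w, 0.0)
    return h


def mean_pairwise_cosine(h):
    """Average cosine similarity between the rows of h, excluding self-pairs."""
    unit = h / np.linalg.norm(h, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = h.shape[0]
    return (sims.sum() - n) / (n * (n - 1))


def input_dependent_fraction(h):
    """Share of channel-level variance that comes from varying the input.

    `within` is the per-channel variance across inputs, averaged over channels.
    `between` is the variance of the per-channel means across channels.
    """
    within = h.var(axis=0).mean()
    between = h.mean(axis=0).var()
    return within / (within + between)


rng = np.random.default_rng(0)
x = rng.standard_normal((64, 512))  # 64 unrelated inputs (assumption: Gaussian noise)
out = relu_mlp_activations(x, depth=50, width=512, rng=rng)

cos_in = mean_pairwise_cosine(x)
cos_out = mean_pairwise_cosine(out)
frac = input_dependent_fraction(out)

print(f"mean cosine similarity between inputs:  {cos_in:.3f}")
print(f"mean cosine similarity between outputs: {cos_out:.3f}")
print(f"input-dependent share of variance:      {frac:.3f}")
```

If the outputs are nearly parallel for unrelated inputs, each channel is close to a constant, and that only shows up when you look at statistics per channel: a histogram pooled over all channels mixes those constants together and can look perfectly healthy.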