Staff Research Scientist at @GoogleAI Brain. Previously @DeepMind.
Dec 7, 2019 • 6 tweets • 5 min read
Why do deep ensembles trained with just random initialization work surprisingly well in practice?
In our recent paper arxiv.org/abs/1912.02757 with @stanislavfort & Huiyi Hu, we investigate this using insights from recent work on the loss landscape of neural nets.
More below:
2) One hypothesis is that ensembles trained from different random initializations reach different modes of the loss landscape, while scalable Bayesian methods may sample from only a single mode.
We measure the similarity of solutions, both in weight space and in function space, to test this hypothesis.
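A minimal sketch of what such a comparison could look like (not the paper's actual code; the metrics here are illustrative assumptions): weight-space similarity as cosine similarity between flattened weight vectors, and function-space similarity as the fraction of test points where two models' predicted labels agree.

```python
import numpy as np

def weight_space_similarity(w1, w2):
    # Cosine similarity between two flattened weight vectors.
    return float(np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2)))

def function_space_disagreement(preds1, preds2):
    # Fraction of test points where the two models' predicted labels differ.
    return float(np.mean(preds1 != preds2))

# Toy illustration with made-up weights and predictions (10-class problem).
rng = np.random.default_rng(0)
w_a = rng.normal(size=100)
w_b = rng.normal(size=100)          # independent init -> similarity near 0
p_a = rng.integers(0, 10, size=1000)
p_b = np.where(rng.random(1000) < 0.2,
               rng.integers(0, 10, size=1000),  # resample ~20% of labels
               p_a)

print(weight_space_similarity(w_a, w_b))
print(function_space_disagreement(p_a, p_b))
```

Two independently trained networks typically end up nearly orthogonal in weight space while also disagreeing on a noticeable fraction of predictions, which is the kind of diversity that makes ensembling them pay off.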