EPFL Virtual Symposium – 15-16 February 2022
In practical applications, deep neural networks are typically trained by walking down the loss surface using gradient descent, augmented with a bag of tricks. One important practical insight has been that large, overparameterized networks, with more parameters than strictly necessary, tend to work better; one potential interpretation (but not the only one) is the ‘lottery ticket hypothesis’. Obviously, the shape of the loss landscape matters for this descent. In recent years, research on the shape of the loss landscape has addressed questions such as: “Is there one big global minimum or many scattered small ones?”, “Is the loss landscape rough or smooth?”, “Should we worry about saddle points, and how many are there?”, “Are there flat regions in the loss?”. While these look like questions for theoreticians, their answers may have practical consequences and lead to a better understanding of the role of overparameterization, of pruning, and of the reasons behind the bag of tricks. The aim of this workshop is to bring together researchers who have worked on these topics from different points of view and with different backgrounds (computer science, mathematics, physics, computational neuroscience), and to build a community around these questions.
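As a purely illustrative aside (not part of the symposium programme), the sketch below shows what “walking down the loss surface” means in the simplest setting: full-batch gradient descent, with gradients computed by hand via backpropagation, on a one-hidden-layer ReLU network that is overparameterized for a tiny toy regression task. The data, network width, learning rate, and number of steps are arbitrary assumptions chosen for illustration only.

```python
# Minimal illustrative sketch (assumed toy setup, not symposium material):
# full-batch gradient descent on a small, overparameterized one-hidden-layer
# ReLU network fitted to a toy 1-D regression problem.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 points; the network below has ~600 parameters, far more than 8.
x = np.linspace(-1.0, 1.0, 8).reshape(-1, 1)
y = np.sin(3.0 * x)

hidden = 200  # overparameterized width (arbitrary choice)
W1 = rng.normal(size=(1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=1.0 / np.sqrt(hidden), size=(hidden, 1))

def forward(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return h, h @ W2                  # hidden activations, network output

lr = 0.01  # learning rate (illustrative)
for step in range(5001):
    h, pred = forward(x)
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)

    # Manual backpropagation for this two-layer network.
    dpred = err / x.shape[0]          # d(loss)/d(pred)
    dW2 = h.T @ dpred
    dh = dpred @ W2.T
    dh[h <= 0.0] = 0.0                # gradient through the ReLU
    dW1 = x.T @ dh
    db1 = dh.sum(axis=0)

    # One step "down the loss surface".
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2

    if step % 1000 == 0:
        print(f"step {step:5d}  training loss {loss:.6f}")
```

With roughly 600 parameters for 8 data points, such a network typically reaches near-zero training loss; why the shape of the landscape makes this easy, and what it implies for generalization and pruning, is exactly the kind of question the talks below address.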
Speakers:
- Giulio Biroli, LPENS
- Joan Bruna, NYU
- Jonathan Frankle, MIT
- Surya Ganguli, Stanford University
- Arthur Jacot, EPFL
- Marco Mondelli, IST Austria
- Hanie Sedghi, Google
- Berfin Simsek, EPFL
- Andrew Gordon Wilson, NYU
- Zhi-Qin John Xu, SJTU Shanghai
Session chairs:
- Volkan Cevher, EPFL
- Johanni Brea, EPFL
- Clément Hongler, EPFL
- Nicolas Flammarion, EPFL
Organizers:
- Wulfram Gerstner, School of Life Sciences and School of Computer and Communication Sciences, EPFL
- Clément Hongler, Institute of Mathematics, School of Basic Sciences, EPFL
All times are given in CET (GMT+1, Swiss time); the programme (14:30–19:30 CET) corresponds to 08:30–13:30 US Eastern Time.
Day 1 – Tuesday 15 February

| Time (CET) | Speaker | Presentation title |
|---|---|---|
| Session 1 | Chair: Volkan Cevher | |
| 14:25 – 14:30 | Wulfram Gerstner | Introduction |
| 14:30 – 15:00 | Joan Bruna | On Shallow Neural Network Optimization Landscapes via Average Homotopy Analysis |
| 15:00 – 15:10 | | Discussion (10 min) |
| 15:10 – 15:40 | Jonathan Frankle | Understanding Loss Landscapes through Neural Network Sparsity |
| 15:40 – 15:50 | | Discussion (10 min) |
| 15:50 – 16:20 | Berfin Simsek | The Geometry of Neural Network Landscapes: Symmetry-Induced Saddles & Global Minima Manifold |
| 16:20 – 16:30 | | Discussion (10 min) |
| 16:30 – 17:05 | | BREAK |
| Session 2 | Chair: Johanni Brea | |
| 17:10 – 17:40 | Zhi-Qin John Xu | Embedding Principle of Loss Landscape |
| 17:40 – 17:50 | | Discussion (10 min) |
| 17:50 – 18:20 | Hanie Sedghi / Rahim Entezari | The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks |
| 18:20 – 18:30 | | Discussion (10 min) |
| 18:30 – 19:30 | | Discussion (Q&A with audience) |
Day 2 – Wednesday 16 February

| Time (CET) | Speaker | Presentation title |
|---|---|---|
| Session 3 | Chair: Nicolas Flammarion | |
| 14:30 – 15:00 | Surya Ganguli | From the geometry of high dimensional energy landscapes to optimal annealing in a dissipative many body quantum optimizer |
| 15:00 – 15:10 | | Discussion (10 min) |
| 15:10 – 15:40 | Marco Mondelli | Landscape Connectivity in Deep Neural Networks: Mean-field and Beyond |
| 15:40 – 15:50 | | Discussion (10 min) |
| 15:50 – 16:20 | Arthur Jacot | Regimes of Training in DNNs: a Loss Landscape Perspective |
| 16:20 – 16:30 | | Discussion (10 min) |
| 16:30 – 17:05 | | BREAK |
| Session 4 | Chair: Clément Hongler | |
| 17:10 – 17:40 | Giulio Biroli | Bad minima, good minima, and rare minima of the loss landscape |
| 17:40 – 17:50 | | Discussion (10 min) |
| 17:50 – 18:20 | Andrew Gordon Wilson | Understanding Loss Valleys for Practical Bayesian Deep Learning |
| 18:20 – 18:30 | | Discussion (10 min) |
| 18:30 – 19:30 | | Discussion (closed amongst speakers and session chairs) |
This symposium can be taken for course credit.