Loss Landscape of Neural Networks: theoretical insights and practical implications

EPFL Virtual Symposium – 15-16 February 2022

In practical applications, Deep Neural Networks are typically trained by walking down the loss surface using gradient descent augmented with a bag of tricks. One important practical insight has been that large, overparameterized networks, i.e. networks with more parameters than strictly necessary, work better; one potential interpretation (but not the only one) is the ‘lottery ticket hypothesis’. Obviously, the shape of the loss landscape matters for this downhill walk. In recent years, research on the shape of the loss landscape has addressed questions such as: “Is there one big global minimum or many scattered small ones?”, “Is the loss landscape rough or smooth?”, “Should we worry about saddle points, and how many are there?”, “Are there flat regions in the loss?”. While these look like questions for theoreticians, their answers may have practical consequences and lead to a better understanding of the role of overparameterization and pruning, and of the reasons behind the bag of tricks. The aim of this workshop is to bring together researchers who have worked on these topics from different points of view and with different backgrounds (Computer Science, Mathematics, Physics, Computational Neuroscience), and to build a community around these questions.
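As an illustration only (not part of the symposium program), the following minimal Python sketch runs plain gradient descent on a toy two-parameter loss surface; the loss function, initialization, and step size are invented for the example. The surface already exhibits two of the features discussed above: two global minima and a saddle point at the origin.

import numpy as np

# Toy non-convex loss: global minima at w = (±1, 0), saddle point at (0, 0).
def loss(w):
    return (w[0] ** 2 - 1) ** 2 + 0.5 * w[1] ** 2

def grad(w):
    return np.array([4 * w[0] * (w[0] ** 2 - 1), w[1]])

w = np.array([0.1, 1.0])   # initialization near the saddle
lr = 0.05                  # learning rate, one knob in the "bag of tricks"
for step in range(200):
    w = w - lr * grad(w)   # walk down the loss surface

print(w, loss(w))          # ends up near one of the minima, here (1, 0)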


Speakers:

Giulio Biroli, LPENS
Joan Bruna, NYU
Jonathan Frankle, MIT
Surya Ganguli, Stanford University
Arthur Jacot, EPFL
Marco Mondelli, IST Austria
Hanie Sedghi, Google
Berfin Simsek, EPFL
Andrew Gordon Wilson, NYU
Zhi-Qin John Xu, SJTU Shanghai

Organizers:

Wulfram Gerstner, School of Life Sciences and School of Computer and Communication Sciences, EPFL
Clément Hongler, Institute of Mathematics, School of Basic Sciences, EPFL

All times in GMT+1 (Swiss time); this corresponds to 08:30 – 13:30 US East Coast time.

Day 1: Tuesday 15 February

Session 1 (Chair: Volkan Cevher)

14:25 – 14:30  Wulfram Gerstner: Introduction
14:30 – 15:00  Joan Bruna: On Shallow Neural Network Optimization Landscapes via Average Homotopy Analysis
Discussion (10 min)
15:10 – 15:40  Jonathan Frankle: Understanding Loss Landscapes through Neural Network Sparsity
Discussion (10 min)
15:50 – 16:20  Berfin Simsek: The Geometry of Neural Network Landscapes: Symmetry-Induced Saddles & Global Minima Manifold
Discussion (10 min)
16:30 – 17:05  BREAK

Session 2 (Chair: Johanni Brea)

17:10 – 17:40  Zhi-Qin John Xu: Embedding Principle of Loss Landscape
Discussion (10 min)
17:50 – 18:20  Hanie Sedghi / Rahim Entezari: The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
Discussion (10 min)
18:30 – 19:30  Discussion (Q&A with audience)


Day 2: Wednesday 16 February

Session 3 (Chair: Nicolas Flammarion)

14:30 – 15:00  Surya Ganguli: From the geometry of high dimensional energy landscapes to optimal annealing in a dissipative many body quantum optimizer
Discussion (10 min)
15:10 – 15:40  Marco Mondelli: Landscape Connectivity in Deep Neural Networks: Mean-field and Beyond
Discussion (10 min)
15:50 – 16:20  Arthur Jacot: Regimes of Training in DNNs: a Loss Landscape Perspective
Discussion (10 min)
16:30 – 17:05  BREAK

Session 4 (Chair: Clément Hongler)

17:10 – 17:40  Giulio Biroli: Bad minima, good minima, and rare minima of the loss landscape
Discussion (10 min)
17:50 – 18:20  Andrew Gordon Wilson: Understanding Loss Valleys for Practical Bayesian Deep Learning
Discussion (10 min)
18:30 – 19:30  Discussion (closed amongst speakers and session chairs)

This symposium can be taken for course credit as BIO-642: State of the Art Topics in Neuroscience XIII.