Swiss Data Science Center (SDSC) - LIONS - EPFL

Current Project:

Rethinking Optimization for Reinforcement Learning (Luca Viano)

Reinforcement learning (RL) and its variants, such as inverse RL, imitation learning, and behavioral cloning, promise automated solutions that surpass human performance on many real-world tasks, including continuous control, robotics, and autonomous driving. In practice, however, they appear to be extremely fragile to mismatches. Existing optimization approaches are motivated by simple tabular settings and do not extend to contemporary neural network representations, because the underlying formulations are nonconvex-nonconcave minimax optimization problems that are subject to hardness results. To this end, we propose a paradigm shift in how we handle RL and its variants, along three directions: (i) how we set up problems with neural networks for continuous state and action spaces; (ii) how we exploit new key structures in minimax problems, such as weak Minty inequalities, with new algorithms such as adaptive double time-scale extragradient methods, in order to overcome hardness results; and (iii) how we exploit new universal second-order methods to close the gap between the convergence rate of algorithms (i.e., their numerical efficiency) and their sample efficiency, which is often more critical for RL. We contend that optimization algorithms cannot be developed in isolation from the context in which RL formulations are proposed. By taking a joint perspective on RL and optimization, we expect our work to make RL more robust, more scalable, and more sample efficient, which we will illustrate with real-life applications.
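
To give a feel for why extragradient-type updates matter in minimax problems, the following minimal sketch compares plain gradient descent-ascent with the classical extragradient step on the toy problem min_x max_y x*y. This is only an illustration under simplifying assumptions: the toy objective, the fixed step size `eta`, and all function names are chosen here for exposition, and the code is the textbook extragradient method, not the adaptive double time-scale variant proposed in the project. The toy problem is bilinear (hence monotone), so it is far easier than the nonconvex-nonconcave problems discussed above.

```python
import math

def F(z):
    """Saddle operator of f(x, y) = x * y, i.e. (df/dx, -df/dy) = (y, -x)."""
    x, y = z
    return (y, -x)

def gda_step(z, eta):
    """Simultaneous gradient descent (in x) and ascent (in y)."""
    g = F(z)
    return (z[0] - eta * g[0], z[1] - eta * g[1])

def extragradient_step(z, eta):
    """Classical extragradient: extrapolate, then update from the original
    point using the operator evaluated at the extrapolated point."""
    g = F(z)
    z_half = (z[0] - eta * g[0], z[1] - eta * g[1])
    g_half = F(z_half)
    return (z[0] - eta * g_half[0], z[1] - eta * g_half[1])

def norm(z):
    """Euclidean distance to the unique saddle point (0, 0)."""
    return math.hypot(z[0], z[1])

eta = 0.5  # illustrative fixed step size (assumption, not tuned)
z_gda = z_eg = (1.0, 1.0)
for _ in range(100):
    z_gda = gda_step(z_gda, eta)
    z_eg = extragradient_step(z_eg, eta)

gda_norm = norm(z_gda)  # grows: plain GDA spirals away from the saddle point
eg_norm = norm(z_eg)    # shrinks: extragradient contracts toward it
print(f"GDA distance to saddle: {gda_norm:.3e}")
print(f"EG  distance to saddle: {eg_norm:.3e}")
```

On this toy problem each gradient descent-ascent step multiplies the squared distance to the saddle point by 1 + eta^2, whereas each extragradient step multiplies it by (1 - eta^2)^2 + eta^2, which is below 1 for any step size between 0 and 1.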

Past Project:

Robust deep learning and generative models (Fabian Latorre)

The great empirical success of neural networks is undermined by their fragility in the presence of mismatched or adversarially perturbed data. To overcome such issues, this research project studies the worst-case robustness of neural networks as measured by their Lipschitz constant. We develop scalable optimization algorithms for computing it, and we leverage this quantity to provide formal certificates of robustness. To train robust neural networks, we develop algorithms that incorporate a penalty on an upper bound of their Lipschitz constant, namely the complexity measure known as the 1-path-norm. This is a challenging task, given the non-convexity and non-smoothness of the underlying objective function. Controlling the Lipschitzness of the discriminator network is also crucial in the Generative Adversarial Networks (GANs) framework. Hence, we also explore the use of our algorithms in this context.
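
The relationship between the 1-path-norm and the Lipschitz constant can be illustrated on a small example. The sketch below assumes a one-hidden-layer ReLU network without biases and with scalar output, f(x) = sum_j v[j] * relu(<W[j], x>), for which the 1-path-norm sum_j |v[j]| * ||W[j]||_1 upper bounds the Lipschitz constant with respect to the l-infinity norm on the inputs. The architecture, the random weights, and all names are assumptions made for illustration; the code only checks the bound empirically and does not reproduce the project's training or certification algorithms.

```python
import random

def relu(t):
    return t if t > 0.0 else 0.0

def forward(W, v, x):
    """One-hidden-layer ReLU network: f(x) = sum_j v[j] * relu(<W[j], x>)."""
    return sum(
        vj * relu(sum(wji * xi for wji, xi in zip(wj, x)))
        for wj, vj in zip(W, v)
    )

def path_norm_1(W, v):
    """1-path-norm: sum over all input-to-output paths of the product
    of the absolute values of the weights along the path."""
    return sum(abs(vj) * sum(abs(wji) for wji in wj) for wj, vj in zip(W, v))

rng = random.Random(0)  # fixed seed so the run is reproducible
d, h = 5, 8             # input dimension and hidden width (illustrative)
W = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(h)]
v = [rng.uniform(-1.0, 1.0) for _ in range(h)]

bound = path_norm_1(W, v)

# Empirical lower estimate of the Lipschitz constant from random input pairs:
# the largest observed ratio |f(x) - f(y)| / ||x - y||_inf.
worst_ratio = 0.0
for _ in range(2000):
    x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    y = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    dist = max(abs(a - b) for a, b in zip(x, y))
    if dist > 0.0:
        ratio = abs(forward(W, v, x) - forward(W, v, y)) / dist
        worst_ratio = max(worst_ratio, ratio)

print(f"1-path-norm (upper bound):       {bound:.4f}")
print(f"largest observed slope (sampled): {worst_ratio:.4f}")
```

The bound follows from the fact that ReLU is 1-Lipschitz and |<W[j], x - y>| <= ||W[j]||_1 * ||x - y||_inf. Because the 1-path-norm is an explicit function of the weights, it can be added to a training loss as a penalty, although the resulting objective remains non-convex and non-smooth, as noted above.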