Outline
Lecture 1: Introduction to Continuous Optimization

Review of basic probability theory. Maximum likelihood, M-estimators, and empirical risk minimization as a motivation for convex optimization.
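
To make the connection concrete, here is a minimal sketch (not course material) of how maximum likelihood for logistic regression becomes an empirical risk minimization problem; the synthetic data and the `logistic_loss` helper are illustrative assumptions.

```python
import numpy as np

def logistic_loss(w, X, y):
    """Empirical risk: average negative log-likelihood of logistic regression, labels y in {-1, +1}."""
    z = X @ w
    return np.mean(np.logaddexp(0.0, -y * z))  # log(1 + exp(-y_i * x_i^T w)), computed stably

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                  # 100 synthetic samples, 5 features
y = np.sign(X @ rng.normal(size=5) + 0.1 * rng.normal(size=100))

w = np.zeros(5)
print("empirical risk at w = 0:", logistic_loss(w, X, y))  # equals log(2) at the zero vector
```
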
Unconstrained smooth minimization I: Concept of an iterative optimization algorithm. Gradient descent. Convergence rate. Characterization of functions.
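
As a rough illustration of the gradient descent topic above, a minimal NumPy sketch on a strongly convex quadratic; the 1/L step size and the fixed iteration count are illustrative choices, not prescriptions from the lecture.

```python
import numpy as np

def gradient_descent(grad, x0, step, iters=200):
    """Plain gradient descent: x_{k+1} = x_k - step * grad(x_k)."""
    x = x0.copy()
    for _ in range(iters):
        x -= step * grad(x)
    return x

# Minimize f(x) = 0.5 * x^T A x - b^T x, a strongly convex quadratic.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
L = np.linalg.eigvalsh(A).max()                  # smoothness constant of f
x_gd = gradient_descent(lambda x: A @ x - b, np.zeros(2), step=1.0 / L)
print(x_gd, np.linalg.solve(A, b))               # the two should nearly coincide
```
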
Unconstrained smooth minimization II: Accelerated gradient methods.
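
A hedged sketch of one common form of Nesterov's accelerated gradient method on the same kind of quadratic; the particular momentum schedule below, t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2, is one standard variant and is assumed here for illustration.

```python
import numpy as np

def nesterov(grad, x0, step, iters=200):
    """Nesterov's accelerated gradient: gradient step at an extrapolated point plus momentum."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = y - step * grad(y)                       # gradient step at the look-ahead point
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum / extrapolation
        x, t = x_next, t_next
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
L = np.linalg.eigvalsh(A).max()
print(nesterov(lambda x: A @ x - b, np.zeros(2), step=1.0 / L))
```
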
Unconstrained smooth minimization III: Adaptive gradient methods. Newton’s method. Accelerated adaptive gradient methods.
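
For intuition only, small sketches of a diagonal adaptive gradient method (AdaGrad-style) and of undamped Newton's method; the test function, step size, and iteration counts are illustrative assumptions.

```python
import numpy as np

def adagrad(grad, x0, step=0.5, iters=500, eps=1e-8):
    """Diagonal AdaGrad: per-coordinate steps scaled by accumulated squared gradients."""
    x, acc = x0.copy(), np.zeros_like(x0)
    for _ in range(iters):
        g = grad(x)
        acc += g * g
        x -= step * g / (np.sqrt(acc) + eps)
    return x

def newton(grad, hess, x0, iters=20):
    """Undamped Newton's method: solve the Newton system and step x <- x - d."""
    x = x0.copy()
    for _ in range(iters):
        x -= np.linalg.solve(hess(x), grad(x))
    return x

# Test problem: f(x) = sum_i exp(x_i) + 0.5 * ||x||^2 (smooth and strictly convex).
g = lambda x: np.exp(x) + x
H = lambda x: np.diag(np.exp(x)) + np.eye(x.size)
print(adagrad(g, np.ones(3)))
print(newton(g, H, np.ones(3)))
```
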
Stochastic gradient methods: Stochastic programming. Stochastic gradient descent. Variance reduction.
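
A minimal sketch of stochastic gradient descent for least squares, sampling one data point per step; the constant step size, epoch count, and synthetic data are illustrative assumptions rather than the course's setup.

```python
import numpy as np

def sgd(X, y, step=0.05, epochs=30, seed=0):
    """Stochastic gradient descent for least squares, one random sample per step."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = (X[i] @ w - y[i]) * X[i]   # gradient of 0.5 * (x_i^T w - y_i)^2
            w -= step * g
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=200)
print(sgd(X, y))                           # should land near (1, -2, 0.5)
```
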
Optimization for Deep Learning: From convex to nonconvex optimization. Neural networks. Saddle point problems. Generative Adversarial Networks.
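
To hint at why saddle-point (min-max) problems behave differently from plain minimization, a tiny sketch of simultaneous gradient descent-ascent on the bilinear toy game min_x max_y x*y; the step size and iteration count are arbitrary, and the divergence it exhibits is the standard cautionary example, not a claim about the course's GAN material.

```python
import numpy as np

# Bilinear saddle-point toy problem: min_x max_y  x * y, whose saddle point is (0, 0).
x, y, step = 1.0, 1.0, 0.1
radii = []
for _ in range(100):
    gx, gy = y, x                              # partial derivatives of x * y
    x, y = x - step * gx, y + step * gy        # simultaneous gradient descent-ascent
    radii.append(np.hypot(x, y))
print(radii[0], radii[-1])                     # the iterates spiral outward, away from (0, 0)
```
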
Composite minimization I: Subgradient method. Proximal and accelerated proximal gradient methods.
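
A hedged sketch of the proximal gradient method (ISTA) for the Lasso, where the proximal operator of the l1 norm is soft-thresholding; the synthetic data, the regularization weight, and the step size 1/L are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(X, y, lam, iters=500):
    """Proximal gradient (ISTA) for the Lasso: 0.5 * ||Xw - y||^2 + lam * ||w||_1."""
    L = np.linalg.norm(X, 2) ** 2              # Lipschitz constant of the smooth part
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        g = X.T @ (X @ w - y)                  # gradient of the smooth term
        w = soft_threshold(w - g / L, lam / L) # gradient step, then prox step
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 10))
y = X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=50)
print(ista(X, y, lam=2.0))                     # largest weights on coordinates 0 and 3
```
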
Composite minimization II: Proximal gradient method for nonconvex problems. Proximal Newton-type methods. Stochastic proximal gradient methods.

Constrained convex minimization I: The primal-dual approach. Smoothing approaches for non-smooth convex minimization.

Constrained convex minimization II: The conditional gradient (Frank-Wolfe) method. Stochastic CGM.
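
A minimal sketch of the conditional gradient (Frank-Wolfe) method over the l1 ball, assuming the standard 2/(k+2) step size and the usual signed-vertex linear minimization oracle; the quadratic objective is only for illustration.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, iters=200):
    """Conditional gradient: move toward the LMO vertex with the classic 2/(k+2) step."""
    x = x0.copy()
    for k in range(iters):
        s = lmo(grad(x))                       # linear minimization oracle over the feasible set
        x += (2.0 / (k + 2.0)) * (s - x)
    return x

def lmo_l1(g, radius=1.0):
    """LMO for the l1 ball: a signed vertex along the largest-magnitude gradient coordinate."""
    s = np.zeros_like(g)
    i = np.argmax(np.abs(g))
    s[i] = -radius * np.sign(g[i])
    return s

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
print(frank_wolfe(lambda x: A @ x - b, lmo_l1, np.zeros(2)))  # constrained minimizer in the l1 ball
```
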