Slides 2023

Outline

The 2023 course consists of the following topics:

Lecture 01: Introduction. The role of models and data. Maximum-likelihood formulation. Sample complexity bound for estimation and prediction.
Lecture 02: Generalized linear model. Logistic regression.
Lecture 03: Linear algebra reminder. Gradients. Reading convergence plots.
Lecture 04: Optimization algorithms. Optimality measures. Structures in optimization. Gradient descent. Gradient descent for smooth functions (a minimal sketch follows this outline).
Lecture 05: Optimality of convergence rates. Lower bounds. Accelerated gradient descent. Concept of total complexity. Adaptive methods. Tensor methods.
Lecture 06: Stochastic gradient descent. Concise signal models. Compressive sensing. Sample complexity bounds for estimation and prediction. Challenges to optimization algorithms for non-smooth optimization. Subgradient method.
Lecture 07: Introduction to proximal operators. Proximal gradient methods. Linear minimization oracles. Conditional gradient method for constrained optimization.
Lecture 08: Variance reduction. Introduction to deep learning. Challenges in deep learning theory and applications.
Lecture 09: Generalization through uniform convergence bounds. Rademacher complexity. Double descent curves and overparameterization. Implicit regularization. Generalization bounds using stability.
Lecture 10: Escaping saddle points. Adaptive gradient methods.
Lecture 11: Adversarial machine learning and generative adversarial networks (GANs). Wasserstein GAN. Difficulty of minimax optimization.
Lecture 12: Robustness in deep learning. Diffusion models.
Lecture 13: Primal-dual optimization I: Fundamentals of minimax problems. Fenchel conjugates. Duality. Extra gradient method. Chambolle-Pock algorithm. Stochastic primal-dual methods.
Lecture 14: Primal-dual optimization II: Augmented Lagrangian gradient methods. Semi-definite programming. HCGM and CGAL algorithms.
Lecture 15: Language models: Basics of language models. Self-attention and Transformers. GPT family.
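
The outline above only names the methods. As a minimal illustrative sketch of the gradient descent iteration for smooth functions covered in Lecture 04 (not the course's own code), one could write the following, assuming a least-squares objective f(x) = 0.5 * ||Ax - b||^2 with step size 1/L, where L is the Lipschitz constant of the gradient; the function and variable names are hypothetical.

import numpy as np

def gradient_descent(A, b, num_iters=100):
    # For f(x) = 0.5 * ||Ax - b||^2, the gradient is A^T (Ax - b) and its
    # Lipschitz constant is L = ||A||_2^2 (largest singular value squared).
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])          # initial iterate x_0 = 0
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)      # gradient at the current iterate
        x = x - grad / L              # fixed step size 1/L for smooth functions
    return x

# Illustrative usage on a random instance
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
x_hat = gradient_descent(A, b)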