Slides 2024

Outline

The 2024 course covers the following topics:
 

Lecture 01

  • Introduction
  • Overview of Mathematics of Data
  • Empirical Risk Minimization
  • Statistical Learning with Maximum Likelihood Estimators

Lecture 02

  • Generalized linear model
  • Linear regression
  • M-estimator examples

Lecture 03

  • Linear algebra reminder
  • Convexity and Gradients
  • Convergence rates and convergence plots

Lecture 04

  • Principles of iterative descent methods
  • Structures in optimization
  • Gradient descent methods

Lecture 05

  • Optimality of convergence rates
  • Lower bounds
  • Accelerated gradient descent
  • Newton and Adaptive methods
  • Tensor methods

Lecture 06

  • Stochastic gradient descent
  • Concise signal models
  • Compressive sensing
  • Sample complexity bounds for estimation and prediction
  • Challenges to optimization algorithms for non-smooth optimization
  • Subgradient method

Lecture 07

  • Composite minimization
  • Proximal gradient methods
  • Introduction to the Frank-Wolfe method

Lecture 08

  • Variance reduction
  • Introduction to deep learning
  • Challenges in deep learning theory and applications

Lecture 09

  • The classical trade-off between model complexity and risk
  • Generalization bounds via uniform convergence
  • Generalization in deep learning
  • Implicit regularization of optimization algorithms
  • Double descent
  • Scaling Laws

Lecture 10

  • Adaptive gradient methods
  • Scalable non-convex optimization

Lecture 11

  • Adversarial machine learning
  • Wasserstein generative adversarial networks
  • Difficulty of minimax optimization

Lecture 12

  • Convergence of minimax
  • Diffusion models
  • Robustness in deep learning

Lecture 13

  • Primal-dual optimization-I: Fundamentals of minimax problems
  • Fenchel conjugates
  • Duality

Lecture 14

  • Primal-dual optimization-II: Augmented Lagrangian gradient methods
  • Semi-definite programming
  • HCGM and CGAL algorithms

Lecture 15

  • Language models: Basics of language models
  • Self-attention and Transformers
  • GPT family