Outline
The 2014 course consists of the following topics:
Lecture 1 | “Objects in Space”: Definitions of norms, inner products, and metrics for vector, matrix and tensor objects. Basics of complexity theory.
Lecture 2 | Maximum likelihood principle as a motivation for convex optimization. Fundamental structures in convex analysis, such as cones, smoothness, and conjugation.
Lecture 3 | Unconstrained, smooth minimization techniques. Gradient methods (see the gradient-descent sketch after this outline). Variable metric algorithms. Time-data tradeoffs in ML estimation.
Lecture 4 | Convex geometry of linear inverse problems. Structured data models (e.g., sparse and low-rank) and convex gauge functions and formulations that encourage these structures. Computational aspects of gauge functions.
Lecture 5 | Composite convex minimization. Regularized M-estimators. Time-data tradeoffs in linear inverse problems.
Lecture 6 | Convex demixing. Statistical dimension. Phase transitions in convex minimization. Smoothing approaches for non-smooth convex minimization.
Lecture 7 | Constrained convex minimization I. Introduction to convex duality. Classical solution methods (the augmented Lagrangian method, the alternating minimization algorithm, the alternating direction method of multipliers, and the Frank-Wolfe method) and their deficiencies.
Lecture 8 | Constrained convex minimization II. Variational gap characterizations and dual smoothing. Scalable, black-box optimization techniques. Time-data tradeoffs for linear inverse problems.
Lecture 9 | Classical black-box convex optimization techniques. Linear programming, semidefinite programming, and the interior point method (IPM). Hierarchies of classical formulations. Time and space complexity of the IPM.
Lecture 10 | Time-data tradeoffs in machine learning.
Lecture 11 | Convex methods for Big Data I: Randomized coordinate descent methods. The PageRank problem and Nesterov’s solution. Composite formulations.
Lecture 12 | Convex methods for Big Data II: Stochastic gradient descent methods. Least squares: conjugate gradients vs. a simple stochastic gradient method (see the sketch after this outline). Dual and gradient averaging schemes. Stochastic mirror descent.
Lecture 13 | Randomized linear algebra routines for scalable convex optimization. Probabilistic algorithms for constructing approximate low-rank matrix decompositions (see the sketch after this outline). Subset selection approaches. Theoretical approximation guarantees.
Lecture 14 | Role of parallel and distributed computing. How to avoid communication bottlenecks and synchronization. Consensus methods. Memory lock-free, decentralized, and asynchronous algorithms.
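
To make the gradient methods of Lecture 3 concrete, here is a minimal sketch (not from the course materials) of fixed-step gradient descent on a least-squares objective. The step size 1/L uses the Lipschitz constant of the gradient; the data sizes and variable names are illustrative assumptions.

```python
import numpy as np

# Fixed-step gradient descent for f(x) = 0.5 * ||A x - b||^2.
# The gradient is A^T (A x - b) and its Lipschitz constant is ||A||_2^2,
# so a step size of 1 / L keeps the iteration stable.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))      # illustrative data
b = rng.standard_normal(200)

L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
x = np.zeros(A.shape[1])
for _ in range(500):
    grad = A.T @ (A @ x - b)            # gradient of the smooth objective
    x -= grad / L                       # gradient step with step size 1/L

print(0.5 * np.linalg.norm(A @ x - b) ** 2)   # final objective value
```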
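Lecture 12 contrasts conjugate gradients with a simple stochastic gradient method on least squares. The sketch below sets up that comparison under illustrative assumptions: conjugate gradients are applied to the normal equations via scipy.sparse.linalg.cg, and the stochastic method samples one row per step with a decaying step size. It is a rough illustration, not the course's implementation.

```python
import numpy as np
from scipy.sparse.linalg import cg

# Least squares, min_x 0.5 * ||A x - b||^2, solved two ways:
# conjugate gradients on the normal equations vs. a simple stochastic
# gradient method that samples one row of A per iteration.
rng = np.random.default_rng(0)
n_samples, n_features = 1000, 20
A = rng.standard_normal((n_samples, n_features))
b = A @ rng.standard_normal(n_features) + 0.1 * rng.standard_normal(n_samples)

# Conjugate gradients applied to A^T A x = A^T b.
x_cg, _ = cg(A.T @ A, A.T @ b)

# Simple stochastic gradient method with a decaying step size.
x_sgd = np.zeros(n_features)
for t in range(1, 20001):
    i = rng.integers(n_samples)
    grad_i = (A[i] @ x_sgd - b[i]) * A[i]   # single-row gradient estimate
    x_sgd -= grad_i / (t + 50.0)            # step size decaying like 1/t

# Compare the final objective values of the two approaches.
print(0.5 * np.linalg.norm(A @ x_cg - b) ** 2,
      0.5 * np.linalg.norm(A @ x_sgd - b) ** 2)
```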
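For the probabilistic low-rank decompositions of Lecture 13, the following sketch is in the spirit of the Halko-Martinsson-Tropp randomized range finder: multiply by a Gaussian test matrix, orthonormalize, and take an SVD of the small projected matrix. The matrix sizes, oversampling parameter, and choice of test matrix are illustrative assumptions, not prescribed by the course.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_svd(M, rank, oversample=10):
    # Sample the range of M with a Gaussian test matrix, orthonormalize,
    # then compute an SVD of the small projected matrix.
    m, n = M.shape
    Omega = rng.standard_normal((n, rank + oversample))  # random test matrix
    Q, _ = np.linalg.qr(M @ Omega)                       # approximate range basis
    B = Q.T @ M                                          # small projected matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]

# Quick check on a matrix that is exactly rank 5 plus small noise.
M = rng.standard_normal((300, 5)) @ rng.standard_normal((5, 200))
M += 1e-3 * rng.standard_normal(M.shape)
U, s, Vt = randomized_svd(M, rank=5)
print(np.linalg.norm(M - U @ np.diag(s) @ Vt) / np.linalg.norm(M))  # relative error
```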