Reinforcement Learning slides 2024

Outline

The 2024 course covers the following topics:

Lecture 01

  • Introduction

Lecture 02

  • MDPs; value and Q-functions; value iteration, policy iteration; operator perspectives. Model-free policy-based and value-based methods; connections to gradient methods; Monte Carlo (MC) method and temporal difference (TD) learning.

Lecture 03

  • Primal and dual LPs, primal-dual methods, REPS. Linear algebra reminder.

Lecture 04

  • Policy parameterizations, policy gradient theorems and estimators, performance difference lemma, gradient dominance and convergence of policy gradient methods, natural policy gradient (NPG)

Lecture 05

  • NPG, sample-based NPG, TRPO, exploration in policy gradients

Lecture 06

  • Behavioral cloning, DAgger, MCE-IRL, GAIL, P2IL, IQ-Learn

Lecture 07

  • Normal-form games (NFGs), equilibria, response dynamics of iterated play, Markov games, RL dynamics in Markov games

Lecture 08

  • Actor-critic-based deep RL: TRPO, Soft Actor-Critic. Value-based deep RL: DQN, Double DQN, Rainbow. Robust RL and IRL.