No-regret learning in games with coupling constraints

 

Motivation: Shared decision-making is at the heart of any multi-agent system. Each agent aims to maximize its reward which depends on the decisions of all agents by repeatedly playing an action and observing the corresponding reward. A common performance measure in such repeated games is regret and several learning algorithms exist which attain no-regret. The global decision of all agents, however, may be subject to constraints and thus such independent decision-making may result in an infeasible outcome. Coupling constraints arise in several real-world problems such as electricity markets, traffic routing, and multi-robot applications. In electricity markets, for example, there are capacity constraints on the load of each grid line. In multi-robot applications in unexplored environments, the robots have to accomplish a task while avoiding collision with obstacles and other robots. For games with coupling constraints it is crucial to develop learning algorithms that not only aim at minimizing agents regret but also ensure that agents learn to play a feasible outcome. To address this, this project aims to develop a no-regret learning algorithm that simultaneously ensures constraint satisfaction.

Outline In a first step, you will start by familiarizing yourself with the relevant literature on no-regret learning (see e.g. [1]) and on learning in games with constraints (see e.g. [2,3]). Next, the goal is to design a no-regret algorithm that ensures constraint satisfactions and provide theoretical guarantees for the developed algorithm. Depending on your own interest and background you can then address coupling constraints in Markov games, a class of repeated games with transition dynamics. Alternatively you focus on practical implementations of your developed algorithm for applications such as electricity markets, shared mobility systems (see e.g. [4]) and traffic routing.

Requirements: We seek for motivated students with a strong mathematical, or computer science background. We do have some concrete ideas on how to tackle the above challenges, but we are always open for different suggestions. If you are interested, please send an email containing 1. one paragraph on your background and fit for the project, 2. your BS and MS transcripts to anna.maddux@epfl.ch.

This project will be supervised by Prof. Maryam Kamgarpour (maryam.kamgarpour@epfl.ch) and Anna Maddux (anna.maddux@epfl.ch).

References:

  1. Sessa, P. G., Bogunovic, I., Kamgarpour, M., and Krause, A. (2019). “No-regret learning in unknown games with correlated payoffs”. Advances in Neural Information Processing Systems.

  2. Maddux, A. M., and Kamgarpour M. (2023). “Multi-Agent Learning in Contextual Games under Unknown Constraints”. arXiv preprint arXiv:2310.14685.

  3. Pragnya, A., Ramponi, G., He, N., and Krause, A. (2023). “Provably Learning Nash Policies in Constrained Markov Potential Games”. arXiv preprint arXiv:2306.07749.

  4. Sessa, P. G., Bogunovic, I., Krause, A., and Kamgarpour, M. (2021). “Online Submodular Resource Allocation with Applications to Rebalancing Shared Mobility Systems”. International Conference on Machine Learning.