Multi-armed restless bandit problems (MARBPs) substantially extend the modeling capability of classical multi-armed bandit problems. MARBPs are Markov decision process models for optimal dynamic priority allocation to a collection of stochastic binary-action (active/passive) projects evolving over time. Interest in MARBPs has grown steadily, spurred by the breadth of their possible applications and by the rich theoretical questions they raise. The goal of this associate team is to develop scalable learning algorithms adapted to MARBPs.
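To make the model concrete, the following is a minimal simulation sketch of a restless bandit instance, not the team's algorithms: it assumes hypothetical per-arm passive/active transition matrices and state-dependent rewards, and uses a simple myopic priority rule to activate a budgeted subset of arms at each step while passive arms keep evolving.

```python
import numpy as np

# Illustrative sketch only: all names, sizes, and dynamics below are
# assumptions chosen for the example, not data from the AIRBA project.

rng = np.random.default_rng(0)

N = 5     # number of arms (projects)
M = 2     # budget: arms activated per step
S = 3     # states per arm
T = 1000  # simulation horizon


def random_stochastic_matrix(rng, n):
    """Draw a random row-stochastic n x n transition matrix."""
    P = rng.random((n, n))
    return P / P.sum(axis=1, keepdims=True)


# Each arm has its own passive/active dynamics and state-dependent rewards.
P_passive = [random_stochastic_matrix(rng, S) for _ in range(N)]
P_active = [random_stochastic_matrix(rng, S) for _ in range(N)]
rewards = rng.random((N, S))  # reward of activating arm i in state s

states = rng.integers(S, size=N)
total_reward = 0.0

for t in range(T):
    # Myopic priority rule (illustrative): activate the M arms whose
    # current state yields the highest immediate reward.
    priorities = rewards[np.arange(N), states]
    active = set(np.argsort(priorities)[-M:])

    for i in range(N):
        if i in active:
            total_reward += rewards[i, states[i]]
            P = P_active[i]
        else:
            P = P_passive[i]  # passive arms still evolve: "restless"
        states[i] = rng.choice(S, p=P[states[i]])

print(f"average reward per step: {total_reward / T:.3f}")
```

The myopic rule is used only to keep the sketch short; index-based or learning policies would replace it in practice.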
In the context of the associate team AIRBA and the ANR project Refino, we organized a workshop on restless bandits in Grenoble (November 2023).