[C98] - Invariance and concentration properties of gradient-based learning in games

K. Lotidis, N. Bambos, J. H. Blanchet, and P. Mertikopoulos. In CDC '25: Proceedings of the 64th IEEE Annual Conference on Decision and Control, 2025.

Abstract

In this paper, we study the long-run behavior of learning in strongly monotone games with stochastic, gradient-based feedback. For concreteness, we focus on the stochastic projected gradient (SPG) algorithm, and we examine the asymptotic distribution of its iterates when the method is run with a constant step-size (the de facto choice in practical deployments of the algorithm). In contrast to variants of the method with a vanishing step-size, SPG with a constant step-size does not converge: instead, it reaches a neighborhood of the game's Nash equilibrium at an exponential rate and then, due to persistent noise, fluctuates in its vicinity without converging, only occasionally moving further away. We provide a theoretical quantification of this behavior by analyzing the Markovian structure of the process. Namely, we show that, regardless of the algorithm's initialization, the distribution of its iterates converges at a geometric rate to a unique invariant measure concentrated in a neighborhood of the game's Nash equilibrium. More explicitly, we quantify the degree of this concentration and the rate of convergence of the algorithm's empirical frequency of play to the invariant measure of the process in the Wasserstein distance, and we provide explicit bounds in terms of the method's step-size, the variance of the noise entering the process, and the geometric features of the game's payoff landscape.
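To make the dynamics described in the abstract concrete, here is a minimal sketch of constant step-size SPG on a toy strongly monotone game. Everything in this snippet is an illustrative assumption rather than the paper's exact setup: the pseudo-gradient `v`, the box feasible set, the step-size `gamma`, and the noise level `sigma` are all chosen just to exhibit the behavior (fast approach to equilibrium, then persistent fluctuation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy strongly monotone game: pseudo-gradient v(x) = A (x - x_star),
# with A having positive-definite symmetric part.  (Assumed example.)
A = np.array([[2.0, 0.5], [-0.5, 2.0]])
x_star = np.zeros(2)  # Nash equilibrium of the toy game

def v(x):
    """Pseudo-gradient of the players' joint payoff profile."""
    return A @ (x - x_star)

def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box feasible set [lo, hi]^2."""
    return np.clip(x, lo, hi)

gamma, sigma, T = 0.1, 0.5, 5000   # constant step-size, noise level, horizon
x = np.array([1.0, -1.0])          # arbitrary initialization
iterates = np.empty((T, 2))

for t in range(T):
    noisy_grad = v(x) + sigma * rng.standard_normal(2)  # stochastic feedback
    x = project(x - gamma * noisy_grad)                 # SPG update
    iterates[t] = x

# With a constant step-size the iterates do not converge: they reach a
# neighborhood of x_star quickly and then keep fluctuating around it,
# consistent with an invariant measure concentrated near the equilibrium.
burn_in = T // 2
dist = np.linalg.norm(iterates[burn_in:] - x_star, axis=1)
print("mean / max distance to equilibrium after burn-in:",
      dist.mean(), dist.max())
```

Shrinking `gamma` (or `sigma`) in this sketch tightens the band of fluctuation around the equilibrium, mirroring the paper's bounds on the concentration of the invariant measure in terms of the step-size and the noise variance.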

arXiv link: <>
