J. Cohen, A. Héliou, and P. Mertikopoulos. In SAGT '17: Proceedings of the 10th International Symposium on Algorithmic Game Theory, 2017.
This paper examines the problem of multi-agent learning in $N$-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that a) dominated strategies become extinct (a.s.); and b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate; that is, much faster than the algorithm’s $O(1/\sqrt{T})$ worst-case regret guarantee would suggest.
arXiv link: https://arxiv.org/abs/1607.08863
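
For intuition, here is a minimal sketch of a multi-agent hedge/exponential-weights loop of the kind the abstract describes. The payoff matrices, the $1/\sqrt{t}$ step-size schedule, and the noise level are illustrative assumptions, not the paper's exact experimental setup; the point is only that, with noisy payoff feedback, play concentrates on the game's strict pure equilibrium and the dominated strategy dies out.

```python
# Illustrative sketch (assumed setup, not the paper's exact scheme):
# each player runs the hedge / exponential weights update on noisy payoffs.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 game (prisoner's-dilemma-like): strategy 1 strictly
# dominates strategy 0 for the row player, so (1, 1) is a strict pure equilibrium.
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])   # row player's payoffs
B = A.T                      # column player's payoffs (symmetric game)

def hedge_play(A, B, T=2000, noise=0.5):
    """Both players update exponential weights on cumulative (noisy) payoff scores."""
    y_row = np.zeros(2)      # row player's cumulative scores
    y_col = np.zeros(2)      # column player's cumulative scores
    for t in range(1, T + 1):
        eta = 1.0 / np.sqrt(t)                     # step-size that does not decay too fast
        x_row = np.exp(y_row); x_row /= x_row.sum()  # logit (softmax) choice map
        x_col = np.exp(y_col); x_col /= x_col.sum()
        # Expected payoff of each pure strategy against the opponent's mixed strategy,
        # perturbed by zero-mean noise to mimic uncertain feedback.
        v_row = A @ x_col + rng.normal(scale=noise, size=2)
        v_col = B.T @ x_row + rng.normal(scale=noise, size=2)
        y_row += eta * v_row
        y_col += eta * v_col
    return x_row, x_col

print(hedge_play(A, B))  # mass concentrates on the dominant strategy for both players
```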