Panayotis Mertikopoulos
Content tagged with: online learning
[W9] On the discrete-time origins of the replicator dynamics: From convergence to instability and chaos
[J44] Nested replicator dynamics, nested logit choice, and similarity-based learning
[J41] A unified stochastic approximation framework for learning in games
[C94] Accelerated regularized learning in finite $N$-person games
[C93] No-regret learning in harmonic games: Extrapolation in the presence of conflicting interests
[C92] A geometric decomposition of finite games: Convergence vs. recurrence under exponential weights
[W6] Learning in quantum games
[J40] Multi-agent online learning in time-varying games
[C89] A quadratic speedup in finding Nash equilibria of quantum zero-sum games
[C88] The equivalence of dynamic and strategic stability under regularized learning in games
[C86] Payoff-based learning with matrix multiplicative weights in quantum games
[C85] Exploiting hidden structures in non-convex games for convergence to Nash equilibrium
[J38] Learning in nonatomic games, Part I: Finite action spaces and population games
[J36] Multi-agent online optimization with delays: Asynchronicity, adaptivity, and optimism
[C83] On the convergence of policy gradient methods to Nash equilibria in general stochastic games
[C82] No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation
[C80] Learning in games with quantized payoff observations
[C79] Online convex optimization in wireless networks and beyond: The feedback-performance trade-off
[C76] Nested bandits
[W5] A heuristic for estimating Nash equilibria in first-price auctions with correlated values
[C72] The convergence rate of regularized learning in games: From bandits and uncertainty to optimism and beyond
[C69] Equilibrium tracking and convergence in dynamic games
[C68] Optimization in open networks via dual averaging
[C67] Adaptive learning in continuous games: Optimal regret bounds and convergence to Nash equilibrium
[C66] Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information
[C63] Zeroth-order non-convex learning via hierarchical dual averaging
[C62] Regret minimization in stochastic non-convex learning via a proximal-gradient approach
[C59] No-regret learning and mixed Nash equilibria: They do not mix
[C58] Online non-convex optimization with imperfect feedback
[C56] A new regret analysis for Adam-type algorithms
[C55] Gradient-free online learning in continuous games with delayed rewards
[C54] Finite-time last-iterate convergence for multi-agent learning in games
[C52] Derivative-free optimization over multi-user MIMO networks
[C51] Online and stochastic optimization beyond Lipschitz continuity: A Riemannian approach
[D3] Online optimization and learning in games: Theory and applications
[J28] Online power optimization in feedback-limited, dynamic and unpredictable IoT networks
[J27] Learning in games with continuous action sets and unknown payoff functions
[C47] Gradient-free online resource allocation algorithms for dynamic wireless networks
[C46] Cautious regret minimization: Online optimization with long-term budget constraints
[W4] Multi-agent online learning with imperfect information
[W3] Online convex optimization and no-regret learning: Algorithms, guarantees and applications
[C41] Bandit learning in concave N-person games
[C40] Learning in games with lossy feedback
[C35] Cycles in adversarial regularized learning
[J23] On the robustness of learning in games with stochastically perturbed payoff observations
[J21] A continuous-time approach to online optimization
[J18] Mixed-strategy learning with continuous action sets
[C33] Countering feedback delays in multi-agent learning
[C31] Learning with bandit feedback in potential games
[C29] Power control in wireless networks via dual averaging
[C28] Mirror descent learning in continuous games
[C27] Convergence to Nash equilibrium in continuous games with noisy first-order feedback
[C26] Hedging under uncertainty: regret minimization meets exponentially fast convergence
[J17] Learning in games via reinforcement and regularization
[J14] Learning to be green: Robust energy efficiency maximization in dynamic MIMO-OFDM systems
[J12] Learning in an uncertain world: MIMO covariance matrix optimization with imperfect feedback
[C25] Interference mitigation via pricing in time-varying cognitive radio systems
[C23] Online interference mitigation via learning in dynamic IoT environments
[C22] Online power allocation for opportunistic radio access in dynamic OFDM networks
[W1] Power control via online learning in non-stationary MIMO networks
[J8] Penalty-regulated dynamics and robust learning procedures in games
[C19] No more tears: A no-regret approach to power control in dynamically varying MIMO networks
[C18] Energy-efficient power allocation in dynamic multi-carrier systems
[J7] Transmit without regrets: Online optimization in MIMO–OFDM cognitive radio systems
[C17] No regrets: Distributed power control under time-varying channels and QoS requirements
[J6] Higher-order game dynamics
[C13] Adaptive spectrum management in MIMO-OFDM cognitive radio: An exponential learning approach
[C11] Accelerating population-based search heuristics by adaptive resource allocation
[J2] The emergence of rational behavior in the presence of stochastic perturbations
[C4] Learning in the presence of noise