Authors:
(1) Enis Chenchene, Department of Mathematics and Scientific Computing, University of Graz;
(2) Hui Huang, Department of Mathematics and Scientific Computing, University of Graz;
(3) Jinniao Qiu, Department of Mathematics and Statistics, University of Calgary.
Table of Links
2.1 Quantitative Laplace principle
2.2 Global convergence in mean-field law
3 Numerical experiments and 3.1 One-dimensional illustrative example
3.2 Nonlinear oligopoly games with several goods
4 Conclusion, Acknowledgments, Appendix, and References
Abstract
In this paper, we present a novel consensus-based zeroth-order algorithm tailored for nonconvex multiplayer games. The proposed method leverages a metaheuristic approach using concepts from swarm intelligence to reliably identify global Nash equilibria. We utilize a group of interacting particles, each agreeing on a specific consensus point, asymptotically converging to the corresponding optimal strategy. This paradigm permits a passage to the mean-field limit, allowing us to establish convergence guarantees under appropriate assumptions regarding initialization and objective functions. Finally, we conduct a series of numerical experiments to unveil the dependency of the proposed method on its parameters and apply it to solve a nonlinear Cournot oligopoly game involving multiple goods.
Keywords: Non-convex, multiplayer games, Nash equilibrium, swarm optimization, Laplace’s principle.
1 Introduction
Multiplayer games [48], ranging from strategic board games to complex economic systems, have always fascinated researchers due to their intricate dynamics and strategic interactions among multiple players. Understanding and analyzing the outcomes of these games is crucial in various fields such as economics [41], social sciences [30], and computer science [22]. In recent years, significant progress has also been made in developing advanced learning approaches within the field of artificial intelligence towards multiplayer scenarios. One notable example is the extension of adversarial learning, which was originally applied to settings with a single generator and discriminator, to accommodate multiple agents [43, 56, 62]. This process can also be observed in reinforcement learning, where multiplayer game theory has been integrated to enhance learning algorithms [8, 19, 42].
The search for global NE in non-convex settings remains an open problem [45, 46]. This challenge arises not only due to the absence of powerful tools compared to those available in convex scenarios but also due to the diversity of non-convex structures, which may require specific methodologies for resolution. While efficient techniques have been developed within convex conditions that have led to significant advancements in multiplayer game models [15, 61], these approaches may prove insufficient when confronted with non-convexity. In such cases, they may become trapped in local NE or approximate solutions while following pseudo-gradients, failing to converge to a global NE. Moreover, while inspiring breakthroughs have been made in solving non-convex two-player min-max games in different contexts, such as Polyak– Lojasiewicz cases [23, 50] or concave cases [44, 52], these advancements may not readily apply to multiplayer settings. This is because the global stationary conditions in multiplayer games are intricately coupled and cannot be independently addressed by each player. Consequently, novel effective methods are required for attaining them, which is exactly the main contribution of this article
In this paper we introduce a novel zero-order consensus-based approach for finding global NE points in multiplayer games. This method draws inspiration from the consensus-based optimization (CBO) framework initially introduced in [10,51]. CBO belongs to the family of global optimization methodologies, which leverages systems of interacting particles to achieve consensus around global minimizers of the cost functions. As part of the broader class of metaheuristics [2, 29], CBO orchestrates interactions between local improvement procedures and global strategies, utilizing both deterministic and stochastic processes. This interplay ultimately results in an efficient and robust procedure for exploring the solution space of the cost functions. More importantly, the CBO approach possesses an inherent advantage of being gradient-free. This characteristic makes it particularly desirable when dealing with cost functions that lack smoothness or when computing their derivatives is computationally expensive.
This paper is available on arxiv under CC BY 4.0 DEED license.