Computational Performance of Deep Reinforcement Learning to Find Nash Equilibria

Christoph Graf, Viktor Zobernig, Johannes Schmidt, Claude Klöckl

Publication: Contribution to journal, Article, Peer-reviewed


We test the performance of deep deterministic policy gradient (DDPG), a deep reinforcement learning algorithm able to handle continuous state and action spaces, at finding Nash equilibria in a setting where firms compete in offer prices through a uniform price auction. Such algorithms are typically considered “model-free”, although they rely on a large set of parameters. These parameters may include learning rates, memory buffers, state space dimensioning, normalizations, or noise decay rates, and the purpose of this work is to systematically test the effect of these parameter configurations on convergence to the analytically derived Bertrand equilibrium. We find parameter choices that can reach convergence rates of up to 99%. We show that the algorithm also converges in more complex settings with multiple players and different cost structures. Its reliable convergence may make the method a useful tool for studying the strategic behavior of firms even in more complex settings.
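To make the market setting concrete, the following is a minimal sketch of how a uniform price auction could be cleared, not the authors' implementation. The function names, the fixed inelastic demand, the capacity limits, and the tie-breaking rule (equal offers are dispatched in index order) are all assumptions introduced for illustration.

```python
from typing import List, Tuple


def clear_uniform_price_auction(
    offers: List[float], capacities: List[float], demand: float
) -> Tuple[float, List[float]]:
    """Dispatch the cheapest offers first until demand is met.

    Every dispatched firm is paid the same price: the offer of the
    last (marginal) firm needed to cover demand.
    Assumption: ties between equal offers are broken by index order.
    """
    order = sorted(range(len(offers)), key=lambda i: offers[i])
    dispatched = [0.0] * len(offers)
    remaining = demand
    price = 0.0
    for i in order:
        if remaining <= 0:
            break
        dispatched[i] = min(capacities[i], remaining)
        remaining -= dispatched[i]
        price = offers[i]  # the marginal offer sets the uniform price
    return price, dispatched


def profits(
    offers: List[float],
    capacities: List[float],
    costs: List[float],
    demand: float,
) -> List[float]:
    """Per-firm profit: (clearing price - marginal cost) * dispatched quantity."""
    price, quantities = clear_uniform_price_auction(offers, capacities, demand)
    return [(price - c) * q for c, q in zip(costs, quantities)]


if __name__ == "__main__":
    # Two hypothetical firms, each with 60 units of capacity, serving 100 units.
    print(clear_uniform_price_auction([10.0, 20.0], [60.0, 60.0], 100.0))
    print(profits([10.0, 20.0], [60.0, 60.0], [5.0, 5.0], 100.0))
```

In a learning experiment of the kind the abstract describes, each firm's offer would be the continuous action chosen by its agent, and a profit function like the one above would supply the reward.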
Pages (from - to): 1-48
Journal: Springer Computational Economics
Publication status: Published - 3 Jan. 2023

Research Field

  • Former Research Field - Integrated Energy Systems


Keywords

  • Bertrand equilibrium
  • Competition in uniform price auctions
  • Deep deterministic policy gradient algorithm
  • DDPG
  • Parameter sensitivity analysis

