arxiv: 2509.16296 · v2 · pith:52X6ZBSJnew · submitted 2025-09-19 · 📡 eess.SY · cs.GT · cs.SY

Learning in Stackelberg Markov Games

classification 📡 eess.SY · cs.GT · cs.SY
keywords energy · stackelberg · learning · environments · framework · approximation · design · equilibria

Designing socially optimal policies in multi-agent environments is a fundamental challenge in both economics and artificial intelligence. This paper studies a general framework for learning Stackelberg equilibria in dynamic and uncertain environments, where a single leader interacts with a population of adaptive followers. Motivated by pressing real-world challenges such as equitable electricity tariff design for consumers with distributed energy resources (such as rooftop solar and energy storage), we formalize a class of Stackelberg Markov games and establish the existence and uniqueness of stationary Stackelberg equilibria under mild continuity and monotonicity conditions. We then extend the framework to incorporate a continuum of agents via mean-field approximation, yielding a tractable Stackelberg-Mean Field Equilibrium (S-MFE) formulation. To address the computational intractability of exact best-response dynamics, we introduce a softmax-based approximation and rigorously bound its error relative to the true Stackelberg equilibrium. Our approach enables scalable and stable learning through policy iteration without requiring full knowledge of follower objectives. We validate the framework on an energy market simulation, where a public utility or a state utility commission sets time-varying rates for a heterogeneous population of prosumers. Our results demonstrate that learned policies can simultaneously achieve economic efficiency, equity across income groups, and stability in energy systems. This work demonstrates how game-theoretic learning frameworks can support data-driven policy design in large-scale strategic environments, with applications to real-world systems like energy markets.
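
The softmax-based approximation mentioned in the abstract can be illustrated with a generic sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the function names `softmax_response` and `best_response`, the `temperature` parameter, and the example follower action values are all hypothetical. The idea shown is only the standard one: replacing a hard argmax best response with a smooth distribution that concentrates on the best action as the temperature shrinks.

```python
import math


def softmax_response(q_values, temperature):
    """Smooth response: probability of each action proportional to exp(q / temperature).

    Hypothetical helper for illustration; not the paper's definition.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def best_response(q_values):
    """Exact best response: all probability mass on one maximizing action."""
    best = max(range(len(q_values)), key=lambda a: q_values[a])
    return [1.0 if a == best else 0.0 for a in range(len(q_values))]


# Assumed follower action values for a fixed leader policy (made-up numbers).
q = [1.0, 2.0, 0.5]

exact = best_response(q)
sharp = softmax_response(q, temperature=0.05)   # close to the exact best response
flat = softmax_response(q, temperature=100.0)   # close to uniform
```

A low temperature makes the smooth response nearly indistinguishable from the exact best response, while a high temperature flattens it toward uniform. The gap between the two is the kind of approximation error the abstract says the paper bounds, although the bound itself is not reproduced here.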

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games

    cs.LG 2026-04 unverdicted novelty 7.0

    Provides the first finite-time convergence guarantees for Q-value iteration in general-sum Stackelberg Markov games.