On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning
In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax function is the monotone gradient map of the log-sum-exp function. By exploiting this connection, we show that the inverse temperature parameter determines the Lipschitz and co-coercivity properties of the softmax function. We then demonstrate the usefulness of these properties through an application in game-theoretic reinforcement learning.
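The relationships stated in the abstract can be checked numerically. The sketch below is not taken from the paper. It assumes the common parameterization in which the softmax with inverse temperature lam has components exp(lam * z_i) / sum_j exp(lam * z_j) and the log-sum-exp function is (1/lam) * log(sum_j exp(lam * z_j)). It also assumes the specific constants lam (Lipschitz) and 1/lam (co-coercivity), which the abstract does not state explicitly. All function names are hypothetical.

```python
import math
import random


def log_sum_exp(z, lam=1.0):
    """(1/lam) * log(sum_i exp(lam * z_i)), shifted by max(z) for stability."""
    m = max(z)
    return m + math.log(sum(math.exp(lam * (zi - m)) for zi in z)) / lam


def softmax(z, lam=1.0):
    """Softmax with inverse temperature lam (assumed parameterization)."""
    m = max(z)
    w = [math.exp(lam * (zi - m)) for zi in z]
    s = sum(w)
    return [wi / s for wi in w]


def numeric_grad(f, z, h=1e-6):
    """Central finite-difference gradient of f at z."""
    g = []
    for i in range(len(z)):
        up, dn = list(z), list(z)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def check(lam=2.0, dim=4, trials=200, seed=0):
    """Return (max gradient error, max Lipschitz ratio, min co-coercivity gap)."""
    rng = random.Random(seed)
    max_grad_err, max_ratio, min_gap = 0.0, 0.0, float("inf")
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        sx, sy = softmax(x, lam), softmax(y, lam)

        # Softmax should match the gradient of log-sum-exp.
        grad = numeric_grad(lambda z: log_sum_exp(z, lam), x)
        max_grad_err = max(max_grad_err, max(abs(a - b) for a, b in zip(grad, sx)))

        ds = [a - b for a, b in zip(sx, sy)]
        dz = [a - b for a, b in zip(x, y)]

        # Lipschitz check: ||softmax(x) - softmax(y)|| <= lam * ||x - y||.
        max_ratio = max(max_ratio, norm(ds) / norm(dz))

        # Co-coercivity check: <ds, dz> >= (1/lam) * ||ds||^2.
        inner = sum(a * b for a, b in zip(ds, dz))
        min_gap = min(min_gap, inner - norm(ds) ** 2 / lam)
    return max_grad_err, max_ratio, min_gap


if __name__ == "__main__":
    grad_err, ratio, gap = check(lam=2.0)
    print("max |grad lse - softmax|:", grad_err)
    print("max Lipschitz ratio (should not exceed lam):", ratio)
    print("min co-coercivity gap (should be nonnegative):", gap)
```

Random sampling of this kind can only fail to find a counterexample; it illustrates the claimed properties but does not prove them.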
Forward citations
Cited by 11 Pith papers
- Sharp Spectral Thresholds for Logit Fixed Points
  For finite-dimensional affine logit systems the sharp dimension-free stability threshold is β‖ΠWΠ‖_{T→T}<2, extending the certified regime beyond classical conservative bounds.
- On Bayesian Softmax-Gated Mixture-of-Experts Models
  Bayesian softmax-gated mixture-of-experts models achieve posterior contraction for density estimation and parameter recovery using Voronoi losses, plus two strategies for choosing the number of experts.
- A Minimal-Assumption Analysis of Q-Learning with Time-Varying Policies
  Establishes last-iterate convergence rates for on-policy Q-learning under minimal irreducibility assumptions, with sample complexity O(1/ξ²) matching off-policy up to exploration factors.
- Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate
  ConfSMoE adds expert-opinion imputation and detaches softmax routing scores to ground-truth task confidence to relieve expert collapse in SMoE without extra load-balance losses, evaluated on four real-world datasets.
- Optimizing Server Placement for Vertical Federated Learning in Dynamic Edge/Fog Networks
  SC-DN establishes a global first-order stationary point per round and solves a mixed-integer signomial program to optimize four control variables for VFL, yielding better classification performance and lower resource ...
- Rethinking Intrinsic Dimension Estimation in Neural Representations
  Common ID estimators fail to track the true intrinsic dimension of neural representations and are instead driven by other factors.
- Learning Empirical Evidence Equilibria under Weak Environmental Coupling
  Decentralized Q-learning agents reach an Empirical Evidence Equilibrium in weakly coupled dynamic environments.
- Learning Empirical Evidence Equilibria under Weak Environmental Coupling
  Proves that Empirical Evidence Equilibria emerge from decentralized Q-value iteration in games with weak environmental coupling, with an extension to softmax policies under a contraction condition.
- Informative Graph Structure Learning
  InGSL reduces edge redundancy in existing graph structure learning methods by adding a mutual-information-guided diversity term, delivering better results with fewer edges across six tested frameworks.
- Structure-Centric Graph Foundation Model via Geometric Bases
  SCGFM creates transferable graph representations by aligning heterogeneous topologies to shared learnable geometric bases via Gromov-Wasserstein distances and re-encoding features accordingly.
- Learning Cut Distributions with Quantum Optimization
  QAOA ansatz with finite layers can capture any bitstring distribution and solves the Fair Cut Cover problem with provable and empirical advantages over classical approximations on certain graphs.