Exploration-Exploitation in Constrained MDPs

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

fields

cs.LG 8 cs.GT 1

years

2026 7 2024 2

polarities

background 3

representative citing papers

Primal-Dual Policy Optimization for Linear CMDPs with Adversarial Losses

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

A new primal-dual algorithm for adversarial linear CMDPs achieves the first sublinear regret and constraint violation bounds, of order K^{3/4}, using weighted LogSumExp softmax policies with periodic mixing and regularized dual updates.

Online Resource Allocation With General Constraints

cs.GT · 2026-05-11 · unverdicted · novelty 7.0

An algorithm for online resource allocation with budget and general constraints achieves O(sqrt(T)) regret in the stochastic regime and alpha-regret in the adversarial regime, with bounded constraint violations in both.

Fairness in two-player zero-sum games with bandit feedback

cs.LG · 2026-05-31 · unverdicted · novelty 6.0

A reparametrization reduces fair zero-sum games under bandit feedback to standard games on a transformed matrix, enabling an Õ(T^{2/3}) regret bound for learning general mixed fair equilibria via an Explore-Then-Commit algorithm.

citing papers explorer

Showing 9 of 9 citing papers.