A unified approach to reinforcement learning, quantal response equilibria, and two-player zero-sum games

3 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.AI (2), cs.LG (1)

years: 2026 (1), 2025 (2)

verdicts: UNVERDICTED (3)

representative citing papers

Multiplayer Nash Preference Optimization

cs.AI · 2025-09-27 · unverdicted · novelty 6.0

MNPO extends NLHF to multiplayer Nash games, inheriting equilibrium guarantees while showing empirical gains on instruction-following benchmarks under diverse preferences.
