The rationality of RL agents is quantified as the expected discrepancy in value between the agent's actions and the optimal actions. The gap between training and deployment is decomposed and bounded in terms of Wasserstein distance and Rademacher complexity, and the analysis is supported by experiments on regularizers.
URL http://www.jstor.org/stable/2778894
Rationality Measurement and Theory for Reinforcement Learning Agents