Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Smaller Abstract State Spaces Enable Cross-Scale Generalization in Reinforcement Learning
A bound on OOD test performance in POMDPs decomposes loss into approximation and estimation errors, indicating that smaller abstract state spaces improve generalization in RL agents.