Machado , title =
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- The Terminal Representation in Reinforcement Learning
  The terminal representation encodes reward-weighted trajectories like the default representation but as a lower-dimensional object usable directly for RL tasks without eigendecomposition.
- SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control
  SAVGO unifies representation learning, value estimation, and policy optimization by embedding state-action pairs such that cosine similarity reflects action-value similarity, enabling similarity-kernel-guided policy improvement.