Proceedings of the 37th International Conference on Machine Learning
4 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control
  SAVGO unifies representation learning, value estimation, and policy optimization by embedding state-action pairs such that cosine similarity reflects action-value similarity, enabling similarity-kernel-guided policy improvement.
- DIVE: Embedding Compression via Self-Limiting Gradient Updates
  DIVE proposes a dimensionality-reduction adapter using self-limiting gradients and implicit view ensembles that outperforms prior adapters on all six BEIR datasets at every tested compression ratio.
- Unlocking Compositional Generalization in Continual Few-Shot Learning
  A decoupling strategy optimizes object slots for holistic class identity during training and composes them at inference to achieve better generalization to unseen concepts in continual few-shot settings.
- Understanding the Prompt Sensitivity
  LLMs disperse meaning-preserving prompts internally instead of clustering them, which produces an excessively high upper bound on output log-probability differences via Taylor expansion and Cauchy-Schwarz.