LEO enables efficient all-goals learning in goal-conditioned RL by jointly predicting for all goals in one network pass, yielding >250x speedup over relabelling and better performance on Craftax.
arXiv preprint arXiv:2504.11054
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2). Verdicts: unverdicted (2).
citing papers explorer
- Goal-Conditioned Agents that Learn Everything All at Once
  LEO enables efficient all-goals learning in goal-conditioned RL by jointly predicting for all goals in one network pass, yielding >250x speedup over relabelling and better performance on Craftax.
- Zero-shot adaptation to order book dynamics
  Introduces a successor-measure adaptation that separates market dynamics from trading objectives inside the Avellaneda-Stoikov HJB framework to enable zero-shot quote adjustment.