Addressing function approximation error in actor-critic methods
2 Pith papers cite this work.
fields: cs.RO (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- ATRS: Adaptive Trajectory Re-splitting via a Shared Neural Policy for Parallel Optimization
  ATRS uses a shared neural policy in a multi-agent MDP to adaptively re-split trajectory segments during parallel ADMM optimization, cutting iterations by up to 26% and time by 19.1% with zero-shot generalization.
- Can Tabular Foundation Models Guide Exploration in Robot Policy Learning?
  TFM-S3 uses a tabular foundation model to predict returns and guide intermittent global exploration within an SVD-derived policy subspace, yielding faster early convergence and better final performance than TD3 and population-based methods under fixed rollout budgets.