Completely derandomized self-adaptation in evolution strategies
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.RO (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Can Tabular Foundation Models Guide Exploration in Robot Policy Learning?
TFM-S3 uses a tabular foundation model to predict returns and guide intermittent global exploration within an SVD-derived policy subspace, yielding faster early convergence and better final performance than TD3 and population-based methods under fixed rollout budgets.