Derives transformer-like dual-filter inference layers from first-principles optimal control on nonlinear discrete and linear Gaussian sequence models.
General duality between optimal control and estimation
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2verdicts
UNVERDICTED 2representative citing papers
DR-SAC is the first actor-critic distributionally robust RL algorithm for offline continuous control that derives a convergent robust soft policy iteration and reports up to 9.8x higher rewards than SAC under perturbations.
citing papers explorer
-
Transformer-like Inference from Optimal Control
Derives transformer-like dual-filter inference layers from first-principles optimal control on nonlinear discrete and linear Gaussian sequence models.
-
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty
DR-SAC is the first actor-critic distributionally robust RL algorithm for offline continuous control that derives a convergent robust soft policy iteration and reports up to 9.8x higher rewards than SAC under perturbations.