SoftGAC defines a stochastic bridge from base to action latent that converts the MaxEnt objective into a tractable relative-entropy term reducible to control energy, achieving competitive returns with one-pass sampling.
Flac: Maximum entropy rl via kinetic energy regularized bridge matching
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.LG 2years
2026 2verdicts
UNVERDICTED 2roles
baseline 1polarities
baseline 1representative citing papers
Stochastic MeanFlow Policies enable one-step generative control in off-policy mirror descent by mapping noise through a MeanFlow transform, yielding tractable entropy and improved MuJoCo performance over Gaussian and generative baselines.
citing papers explorer
-
Generative Actor-Critic with Soft Bridge Policies
SoftGAC defines a stochastic bridge from base to action latent that converts the MaxEnt objective into a tractable relative-entropy term reducible to control energy, achieving competitive returns with one-pass sampling.
-
Stochastic MeanFlow Policies: One-Step Generative Control with Entropic Mirror Descent
Stochastic MeanFlow Policies enable one-step generative control in off-policy mirror descent by mapping noise through a MeanFlow transform, yielding tractable entropy and improved MuJoCo performance over Gaussian and generative baselines.