Reinforcement learning in continuous time and space: A stochastic control approach. Journal of Machine Learning Research, 21(198):1–34
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
citing papers explorer
- Policy Gradient for Continuous-Time Mean-Field Control
  Derives an explicit Gâteaux policy-gradient formula for entropy-regularized continuous-time mean-field control using the value function and cylindrical representations, then builds a model-based actor-critic scheme with PDE well-posedness analysis.
- Amortized Guidance for Image Inpainting with Pretrained Diffusion Models
  AID amortizes guidance for diffusion inpainting by training a reusable module via an auxiliary Gaussian formulation and continuous-time actor-critic algorithm, improving quality-speed trade-off with under 1% overhead.