Path Integral Control for Partially Observed Systems with Controlled Sensing
Pith reviewed 2026-05-10 02:45 UTC · model grok-4.3
The pith
Treating the observation matrix as a control input and restricting it to a measurable selector from the matching set reduces the Hamilton-Jacobi-Bellman equation in Gaussian belief space to a linear PDE with a Feynman-Kac representation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Constraining the sensing control to a measurable selector from the matching set reduces the Hamilton-Jacobi-Bellman equation for the belief mean and covariance to a linear PDE with a Feynman-Kac representation.
What carries the argument
The matching set of observation matrices that align the diffusion of the belief mean with the actuation authority, together with the choice of a measurable selector from that set as the sensing control.
If this is right
- The optimal cost-to-go in Gaussian belief space can be expressed as an expectation under a controlled diffusion, enabling Monte Carlo approximation.
- Path integral control extends directly to partially observed linear systems once the observation matrix is allowed to vary.
- The separation principle between estimation and control is retained while the sensing policy is chosen to satisfy the matching condition.
- Numerical solution of the original nonlinear HJB equation is replaced by Monte Carlo simulation of the diffusion that appears in the Feynman-Kac representation of the linear PDE.
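The Monte Carlo reading of these consequences can be illustrated with a minimal sketch. It assumes a scalar state, a generic linear PDE of the form psi_t + f psi_x + 0.5 sigma^2 psi_xx - (q / lam) psi = 0 with terminal condition psi(x, T) = exp(-phi(x) / lam), and an Euler-Maruyama discretisation. The function names, arguments, and the scalar setting are illustrative assumptions, not the paper's belief-space formulation.

```python
import math
import random


def feynman_kac_estimate(x0, t0, horizon, drift, sigma, running_cost,
                         terminal_cost, lam, n_paths=5000, n_steps=20, seed=0):
    """Monte Carlo estimate of psi(x0, t0) via the Feynman-Kac formula.

    psi(x, t) = E[exp(-(1/lam) * (integral of q ds + phi(X_T))) | X_t = x],
    where X follows dX = drift dt + sigma dW (hypothetical scalar model).
    """
    rng = random.Random(seed)
    dt = (horizon - t0) / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _path in range(n_paths):
        x, t, path_cost = x0, t0, 0.0
        for _step in range(n_steps):
            # Left-endpoint quadrature of the running cost.
            path_cost += running_cost(x, t) * dt
            # Euler-Maruyama step of the sampling diffusion.
            x += drift(x, t) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
            t += dt
        path_cost += terminal_cost(x)
        total += math.exp(-path_cost / lam)
    return total / n_paths


def cost_to_go(psi, lam):
    """Recover the cost-to-go from the desirability: V = -lam * log(psi)."""
    return -lam * math.log(psi)


if __name__ == "__main__":
    # Zero drift, zero running cost, quadratic terminal cost: the exact value
    # is E[exp(-X^2 / 2)] with X ~ N(0, 1), which equals 1 / sqrt(2).
    psi = feynman_kac_estimate(
        x0=0.0, t0=0.0, horizon=1.0,
        drift=lambda x, t: 0.0, sigma=1.0,
        running_cost=lambda x, t: 0.0,
        terminal_cost=lambda x: 0.5 * x * x,
        lam=1.0,
    )
    print(psi, cost_to_go(psi, 1.0))
```

The test case in the `__main__` guard was chosen because it has a closed-form answer, which makes it a convenient sanity check before moving to a higher-dimensional belief state.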
Where Pith is reading between the lines
- Real-time selection of the observation matrix could be performed by checking membership in the matching set at each step.
- The same selector idea might be tested on discrete-time or switched linear systems to see whether the linear-PDE reduction survives.
- Joint optimization of actuation and sensing policies becomes feasible once both are expressed through the same matching framework.
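The membership check mentioned in the first item could look roughly like the sketch below. The review does not state the matching condition explicitly, so the form used here is an assumption: the diffusion covariance of the Kalman-Bucy belief mean, P C^T V^{-1} C P, is required to equal lam * B R^{-1} B^T. All names (`in_matching_set`, `V_inv`, `R_inv`, `lam`) are hypothetical.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def mean_diffusion_cov(C, P, V_inv):
    """Diffusion covariance P C^T V^{-1} C P of the Kalman-Bucy belief mean."""
    gain = matmul(matmul(P, transpose(C)), V_inv)  # Kalman gain P C^T V^{-1}
    return matmul(matmul(gain, C), P)


def in_matching_set(C, P, V_inv, B, R_inv, lam, tol=1e-9):
    """Check the ASSUMED matching condition entrywise, up to tol:

        P C^T V^{-1} C P == lam * B R^{-1} B^T
    """
    lhs = mean_diffusion_cov(C, P, V_inv)
    target = matmul(matmul(B, R_inv), transpose(B))
    return all(
        abs(lhs_val - lam * target_val) <= tol
        for lhs_row, target_row in zip(lhs, target)
        for lhs_val, target_val in zip(lhs_row, target_row)
    )


if __name__ == "__main__":
    P = [[1.0, 0.0], [0.0, 1.0]]   # belief covariance
    B = [[0.0], [1.0]]             # actuation enters the second state only
    identity_1 = [[1.0]]           # V^{-1} and R^{-1} for scalar noise and cost
    print(in_matching_set([[0.0, 1.0]], P, identity_1, B, identity_1, 1.0))
    print(in_matching_set([[1.0, 0.0]], P, identity_1, B, identity_1, 1.0))
```

Under this assumed form, with V equal to the identity and P invertible, the choice C = sqrt(lam) R^{-1/2} B^T P^{-1} satisfies the condition. That suggests the observation matrix needs at least rank(B) rows, though the paper's own definition may differ.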
Load-bearing premise
A measurable selector from the matching set always exists and can be applied while preserving the Gaussian belief structure without adding constraints that invalidate the reduction.
What would settle it
A concrete linear system for which no measurable selector from the matching set exists, or for which applying any such selector changes the belief dynamics so that they cease to remain Gaussian, would show the claimed reduction does not hold.
Original abstract
Path integral control in Gaussian belief space requires a structural matching condition between the observation-driven diffusion of the belief mean and the actuation authority, which a fixed observation matrix cannot enforce. We treat the observation matrix as a control variable and show that constraining the sensing control to a measurable selector from the resulting matching set reduces the Hamilton-Jacobi-Bellman equation for the belief mean and covariance to a linear PDE with a Feynman-Kac representation.
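For orientation, the kind of matching condition the abstract refers to can be seen in the standard fully observed path-integral setting. The derivation below is a sketch of that textbook case, not the paper's belief-space result. The symbols f, G, q, R, \phi, \Sigma_w and \lambda are assumed notation.

```latex
% Fully observed setting with control-affine dynamics (illustrative notation).
\begin{align*}
dx &= f(x,t)\,dt + G(x,t)\,\bigl(u\,dt + dw\bigr),
   \qquad \mathbb{E}[dw\,dw^\top] = \Sigma_w\,dt,\\
J  &= \mathbb{E}\Bigl[\phi(x_T) + \int_t^T \bigl(q(x_s,s)
      + \tfrac12 u_s^\top R\,u_s\bigr)\,ds\Bigr],\\
-\partial_t V &= q + f^\top \nabla V
      - \tfrac12 \nabla V^\top G R^{-1} G^\top \nabla V
      + \tfrac12 \operatorname{tr}\bigl(G \Sigma_w G^\top \nabla^2 V\bigr).
\end{align*}
% Substituting V = -\lambda \log \psi, the terms quadratic in \nabla\psi cancel when
\[
\Sigma_w = \lambda R^{-1} \qquad \text{(matching condition)},
\]
% which leaves a linear PDE with a Feynman-Kac representation:
\begin{align*}
\partial_t \psi &= \frac{q}{\lambda}\,\psi - f^\top \nabla \psi
      - \tfrac12 \operatorname{tr}\bigl(G \Sigma_w G^\top \nabla^2 \psi\bigr),
   \qquad \psi(x,T) = e^{-\phi(x)/\lambda},\\
\psi(x,t) &= \mathbb{E}\Bigl[\exp\Bigl(-\tfrac{1}{\lambda}\Bigl(\phi(X_T)
      + \int_t^T q(X_s,s)\,ds\Bigr)\Bigr) \,\Big|\, X_t = x\Bigr],
   \qquad dX = f\,dt + G\,dw.
\end{align*}
```

In belief space the noise driving the mean comes from the observations rather than from an exogenous process, which is presumably why a fixed observation matrix cannot be tuned to satisfy the analogous condition.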
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that path integral control in Gaussian belief space requires a structural matching condition between observation-driven diffusion of the belief mean and actuation authority that a fixed observation matrix cannot enforce. By treating the observation matrix as a control variable and constraining the sensing control to a measurable selector from the resulting matching set, the Hamilton-Jacobi-Bellman equation for the belief mean and covariance reduces to a linear PDE with a Feynman-Kac representation.
Significance. If the reduction is rigorously established, including selector existence and invariance of the Gaussian belief structure, the result would meaningfully extend path integral methods to partially observed systems with controlled sensing, enabling tractable solutions to otherwise nonlinear HJB problems in belief space. The approach correctly leverages standard HJB and Feynman-Kac machinery but hinges on the novel controlled-sensing mechanism to achieve linearity.
major comments (2)
- [Abstract and derivation of the reduced PDE] The central reduction (abstract and main theorem) requires that, for every admissible belief state, a measurable selector from the matching set exists such that the observation-driven diffusion exactly cancels the control authority term while leaving covariance evolution unchanged. The manuscript provides no explicit construction, existence proof, or invariance argument showing that the selector preserves Gaussianity and does not introduce state-dependent jumps or non-Gaussianity; this assumption is load-bearing for the linearity claim.
- [Belief dynamics and matching condition] The statement that covariance dynamics remain unaffected by selector choice (belief dynamics section) is asserted without an explicit invariance proof under the selector; if the selector depends on the instantaneous belief in a manner that couples back into the diffusion terms, the Gaussian closure and subsequent Feynman-Kac representation would fail.
minor comments (2)
- [Introduction] Define the matching set and the selector function u_s(t, belief) with precise notation in the introduction or preliminary section rather than deferring to later technical development.
- [Preliminaries] Add a short remark on the dimension of the observation matrix relative to the state to clarify when the matching set is non-empty.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. The feedback highlights important points regarding the rigor of the selector existence and invariance arguments, which we address below. We will revise the manuscript to strengthen these aspects while preserving the core contribution.
Point-by-point responses
- Referee: [Abstract and derivation of the reduced PDE] The central reduction (abstract and main theorem) requires that, for every admissible belief state, a measurable selector from the matching set exists such that the observation-driven diffusion exactly cancels the control authority term while leaving covariance evolution unchanged. The manuscript provides no explicit construction, existence proof, or invariance argument showing that the selector preserves Gaussianity and does not introduce state-dependent jumps or non-Gaussianity; this assumption is load-bearing for the linearity claim.
  Authors: We agree that the current version does not include an explicit construction or full existence proof for the measurable selector. The matching set is defined pointwise for each belief state via the structural condition on the observation-driven diffusion term. By the Kuratowski-Ryll-Nardzewski measurable selection theorem, a measurable selector exists whenever the set-valued map is weakly measurable with nonempty closed values; nonemptiness and closedness hold under the problem's controllability and observability assumptions. The selector acts only on the mean diffusion term and does not alter the covariance Riccati dynamics, thereby preserving the Gaussian closure. We will add an appendix providing the explicit selector construction via the theorem and a short invariance argument confirming that no state-dependent jumps or non-Gaussianity are introduced.
  revision: yes
- Referee: [Belief dynamics and matching condition] The statement that covariance dynamics remain unaffected by selector choice (belief dynamics section) is asserted without an explicit invariance proof under the selector; if the selector depends on the instantaneous belief in a manner that couples back into the diffusion terms, the Gaussian closure and subsequent Feynman-Kac representation would fail.
  Authors: The covariance evolution follows the standard Kalman-Bucy Riccati equation, which depends on the chosen observation matrix but is independent of the particular selector value once the matrix is fixed. Because the selector is chosen only to satisfy the mean-matching condition and the covariance update does not feed back into the selector definition in a way that violates closure, Gaussianity is preserved. We acknowledge the assertion was stated without a dedicated invariance lemma. In the revision we will insert a short proof in the belief-dynamics section showing that the selector, being a measurable function of the current belief, leaves the covariance ODE unchanged and therefore maintains the conditions required for the Feynman-Kac representation.
  revision: yes
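To make the invariance question concrete, the Kalman-Bucy equations can be written out for a linear-Gaussian model with a time-varying observation matrix. The sketch below is illustrative only. The model, the noise covariances W and V, and in particular the form of the matching condition are assumptions, since the review does not reproduce the paper's definitions.

```latex
% Assumed model: dx = (A x + B u) dt + dw,  dy = C_t x dt + dv,
% with E[dw dw^T] = W dt and E[dv dv^T] = V dt.
\begin{align*}
dm_t &= (A m_t + B u_t)\,dt
        + P_t C_t^\top V^{-1}\bigl(dy_t - C_t m_t\,dt\bigr),\\
\dot P_t &= A P_t + P_t A^\top + W - P_t C_t^\top V^{-1} C_t P_t.
\end{align*}
% ASSUMPTION: the matching condition has the form
%   P_t C_t^T V^{-1} C_t P_t = \lambda B R^{-1} B^T.
% Every C_t satisfying it then gives the same covariance equation:
\[
\dot P_t = A P_t + P_t A^\top + W - \lambda\, B R^{-1} B^\top .
\]
```

If the paper's condition has this form, the covariance ODE would be identical for every element of the matching set, which is one route to the invariance lemma the referee asks for. Whether the actual condition permits this argument has to be checked against the manuscript.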
Circularity Check
No significant circularity; derivation applies standard HJB reduction under an introduced control constraint.
full rationale
The paper defines a matching condition required for path-integral control in Gaussian belief space, treats the observation matrix as a control input to form the matching set, and then shows that selecting from this set reduces the HJB to a linear PDE admitting a Feynman-Kac representation. This is a direct substitution of the matching condition into the HJB equation rather than a self-referential definition or fitted parameter renamed as a prediction. No self-citation chain is load-bearing for the central reduction, and the derivation remains self-contained against the external benchmarks of stochastic control theory and Feynman-Kac representations. The existence of a measurable selector is an assumption whose verification lies outside the circularity patterns.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Path integral control applies in Gaussian belief space.
- domain assumption: A structural matching condition between the observation-driven diffusion of the belief mean and the actuation authority is required.