Path Integral Control for Partially Observed Systems with Controlled Sensing

Goutam Das; Takashi Tanaka

arxiv: 2604.18941 · v1 · submitted 2026-04-21 · 📡 eess.SY · cs.SY

Pith reviewed 2026-05-10 02:45 UTC · model grok-4.3

keywords: path integral control, partially observed systems, controlled sensing, Gaussian belief space, Hamilton-Jacobi-Bellman equation, Feynman-Kac representation, optimal control

The pith

Treating the observation matrix as a control input and restricting it to a measurable selector from the matching set reduces the Hamilton-Jacobi-Bellman equation in Gaussian belief space to a linear PDE with a Feynman-Kac representation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Path integral control converts stochastic optimal control into sampling problems, but requires a structural match between how observations update the belief and how the system can be actuated. In partially observed linear systems a fixed observation matrix usually fails this match, leaving the optimality equation nonlinear and hard to solve. The paper treats the observation matrix itself as a decision variable and shows that any measurable selector taken from the resulting matching set makes the nonlinear Hamilton-Jacobi-Bellman equation for the belief mean and covariance collapse to a linear partial differential equation. That linear equation possesses an explicit probabilistic representation via the Feynman-Kac formula, turning the original control problem into one that can be addressed by sampling. A reader cares because many practical systems operate with incomplete state information, and this reduction may let sampling-based controllers handle partial observability without sacrificing the path-integral structure.
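For orientation, the standard path-integral linearization in the fully observed case can be sketched as follows. This is the textbook argument in the style of Kappen and Theodorou, not the paper's belief-space derivation; the symbols $f$, $G$, $R$, $\Sigma$, $q$, $\phi$ and $\lambda$ are notation assumed here for illustration.

```latex
% Assumed dynamics and cost (fully observed, illustrative notation):
%   dx = f(x,t)\,dt + G(x,t)\,(u\,dt + dw), \qquad \mathbb{E}[dw\,dw^\top] = \Sigma\,dt,
%   J = \mathbb{E}\Big[\phi(x_T) + \int_t^T \big(q(x_s,s) + \tfrac12 u_s^\top R\,u_s\big)\,ds\Big].
\begin{align}
-\partial_t V &= \min_u \Big[ q + \tfrac12 u^\top R u + (f + G u)^\top \nabla V
                 + \tfrac12 \operatorname{tr}\!\big(G \Sigma G^\top \nabla^2 V\big) \Big],
  \qquad u^\star = -R^{-1} G^\top \nabla V, \\
-\partial_t V &= q + f^\top \nabla V - \tfrac12 \nabla V^\top G R^{-1} G^\top \nabla V
                 + \tfrac12 \operatorname{tr}\!\big(G \Sigma G^\top \nabla^2 V\big).
\end{align}
% Substituting V = -\lambda \log \psi, the quadratic terms in \nabla\psi cancel
% exactly when the matching condition \Sigma = \lambda R^{-1} holds, leaving
\begin{align}
-\partial_t \psi &= -\tfrac{1}{\lambda}\, q\, \psi + f^\top \nabla \psi
                    + \tfrac12 \operatorname{tr}\!\big(G \Sigma G^\top \nabla^2 \psi\big),
  \qquad \psi(x,T) = e^{-\phi(x)/\lambda}, \\
\psi(x,t) &= \mathbb{E}\Big[\exp\!\Big(-\tfrac{1}{\lambda}\Big(\phi(x_T) + \int_t^T q(x_s,s)\,ds\Big)\Big)
             \,\Big|\, x_t = x \Big],
  \qquad dx = f\,dt + G\,dw .
\end{align}
```

In this classical setting the noise must enter through the same channel as the control, with covariance proportional to the inverse control weight. The pith above describes the belief-space analogue of that requirement.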

Core claim

Constraining the sensing control to a measurable selector from the matching set reduces the Hamilton-Jacobi-Bellman equation for the belief mean and covariance to a linear PDE with a Feynman-Kac representation.

What carries the argument

The matching set of observation matrices that align the diffusion of the belief mean with the actuation authority, together with the choice of a measurable selector from that set as the sensing control.
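One plausible way to write such a matching set, assuming a standard linear-Gaussian model with Kalman-Bucy filtering, is sketched below. The paper's exact definition is not reproduced on this page, so the model, the symbols $A$, $B$, $W$, $V$, $R$, $\lambda$, and the set $\mathcal{M}(P)$ are all assumptions made for illustration.

```latex
% Assumed model: dx = (A x + B u)\,dt + dw,\; \mathbb{E}[dw\,dw^\top] = W\,dt;
%                dy = C_t x\,dt + dv,\; \mathbb{E}[dv\,dv^\top] = V\,dt.
% Kalman-Bucy belief \mathcal{N}(m, P) with innovation d\nu = dy - C_t m\,dt:
\begin{align}
dm &= (A m + B u)\,dt + P C_t^\top V^{-1}\,d\nu, \\
\dot P &= A P + P A^\top + W - P C_t^\top V^{-1} C_t P .
\end{align}
% The belief mean therefore diffuses with covariance rate P C_t^\top V^{-1} C_t P.
% A hypothetical matching set aligning this with the actuation authority:
\begin{equation}
\mathcal{M}(P) = \big\{\, C \;:\; P C^\top V^{-1} C P = \lambda\, B R^{-1} B^\top \,\big\}.
\end{equation}
```

Under this assumed form, every $C \in \mathcal{M}(P)$ yields the same information term in the covariance equation, which is one way the choice of selector could leave the covariance dynamics unaffected.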

If this is right

The optimal cost-to-go in Gaussian belief space can be expressed as an expectation under a controlled diffusion, enabling Monte Carlo approximation.
Path integral control extends directly to partially observed linear systems once the observation matrix is allowed to vary.
The separation principle between estimation and control is retained while the sensing policy is chosen to satisfy the matching condition.
Numerical solution of the original nonlinear problem is replaced by Monte Carlo simulation of the diffusion that underlies the linear PDE's Feynman-Kac representation.
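The Monte Carlo idea behind these consequences can be illustrated on a toy problem. The sketch below estimates a Feynman-Kac expectation for a scalar, fully observed diffusion standing in for the belief state; the dynamics, the quadratic costs, the function name `feynman_kac_value`, and all parameter values are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def feynman_kac_value(x0, a=-1.0, sigma=0.5, T=1.0,
                      n_steps=200, n_paths=20000, seed=0):
    """Estimate psi(x0, 0) = E[exp(-(phi(x_T) + int q dt) / lam)] and
    V = -lam * log(psi) for the uncontrolled toy diffusion
    dx = a x dt + sigma dw, with q(x) = phi(x) = 0.5 x^2 (assumed costs)."""
    rng = np.random.default_rng(seed)
    lam = sigma ** 2          # matching condition with unit control weight (assumed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    running_cost = np.zeros(n_paths)
    for _ in range(n_steps):
        running_cost += 0.5 * x ** 2 * dt                  # accumulate state cost
        noise = rng.standard_normal(n_paths)
        x = x + a * x * dt + sigma * np.sqrt(dt) * noise   # Euler-Maruyama step
    total_cost = running_cost + 0.5 * x ** 2               # add terminal cost
    psi = float(np.mean(np.exp(-total_cost / lam)))        # Feynman-Kac average
    return psi, -lam * np.log(psi)

psi_near, v_near = feynman_kac_value(0.0)
psi_far, v_far = feynman_kac_value(2.0)
print(f"V(0) ~ {v_near:.3f}, V(2) ~ {v_far:.3f}")
```

Starting further from the origin should give a larger estimated cost-to-go. In the setting described above, the sampled state would instead be the belief mean and covariance, driven by the selected sensing policy.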

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Real-time selection of the observation matrix could be performed by checking membership in the matching set at each step.
The same selector idea might be tested on discrete-time or switched linear systems to see whether the linear-PDE reduction survives.
Joint optimization of actuation and sensing policies becomes feasible once both are expressed through the same matching framework.
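A per-step membership check of the kind suggested above might look like the following sketch. It assumes a hypothetical matching condition of the form P Cᵀ V⁻¹ C P = λ B R⁻¹ Bᵀ, which is a guess at the structure and not the paper's stated definition; the function names and numbers are likewise illustrative.

```python
import numpy as np

def mean_diffusion_cov(C, P, V):
    """Covariance rate of the Kalman-Bucy mean update: P C^T V^{-1} C P."""
    return P @ C.T @ np.linalg.solve(V, C @ P)

def in_matching_set(C, P, V, B, R, lam, tol=1e-9):
    """Check the ASSUMED condition P C^T V^{-1} C P == lam * B R^{-1} B^T."""
    target = lam * B @ np.linalg.solve(R, B.T)
    return bool(np.allclose(mean_diffusion_cov(C, P, V), target, atol=tol))

def scalar_selector(p, v, b, r, lam):
    """One-dimensional selector: the positive root c = sqrt(lam b^2 v / r) / p.
    Requires p > 0; the negative root would satisfy the same condition."""
    if p <= 0:
        raise ValueError("covariance must be positive")
    return np.sqrt(lam * b ** 2 * v / r) / p

# Toy scalar example (all values are illustrative assumptions).
P = np.array([[2.0]])
V = np.array([[1.0]])
B = np.array([[1.0]])
R = np.array([[1.0]])
lam = 0.25
c = scalar_selector(2.0, 1.0, 1.0, 1.0, lam)
print(c, in_matching_set(np.array([[c]]), P, V, B, R, lam))
```

In this scalar case the assumed set has two elements for each positive covariance, and picking the positive root gives a selector that is continuous in the covariance, hence measurable.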

Load-bearing premise

A measurable selector from the matching set always exists and can be applied while preserving the Gaussian belief structure without adding constraints that invalidate the reduction.

What would settle it

A concrete linear system for which no measurable selector from the matching set exists, or for which applying any such selector changes the belief dynamics so that they cease to remain Gaussian, would show the claimed reduction does not hold.

Figures

Figures reproduced from arXiv: 2604.18941 by Goutam Das, Takashi Tanaka.

Original abstract

Path integral control in Gaussian belief space requires a structural matching condition between the observation-driven diffusion of the belief mean and the actuation authority, which a fixed observation matrix cannot enforce. We treat the observation matrix as a control variable and show that constraining the sensing control to a measurable selector from the resulting matching set reduces the Hamilton-Jacobi-Bellman equation for the belief mean and covariance to a linear PDE with a Feynman-Kac representation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper lets you control the observation matrix to enforce the matching condition and reduce the belief-space HJB to a linear PDE with Feynman-Kac form, but the existence and invariance of the required measurable selector are assumed rather than shown.

The main point is that fixed observation matrices often fail the structural matching condition needed for path integral control in Gaussian belief space. By treating the observation matrix as a control input and restricting it to a measurable selector from the matching set, the authors reduce the HJB equation on the belief mean and covariance to a linear PDE that admits a Feynman-Kac representation. This directly addresses a limitation in earlier fixed-observation approaches to belief-space path integral control.

The abstract is straightforward about building on standard HJB and Feynman-Kac machinery without adding free parameters or circular definitions, which keeps the claim clean on its own terms. If the selector can be constructed reliably, the reduction would give a practical way to apply these methods in partially observed settings such as robotics or autonomous systems. The work is new in its use of dynamic sensing to enforce the condition rather than assuming it holds for a given sensor.

The soft spot is the unproven claim that such a selector always exists for reachable beliefs and that choosing from the set leaves the covariance evolution and Gaussian structure intact. The abstract simply states that the constraint reduces the equation, but provides no explicit construction, invariance proof, or example showing the set is nonempty and the choice does not introduce jumps or non-Gaussianity. If the matching set is empty for some beliefs or the selector affects the filter in unintended ways, the linearity fails. This is the part that needs the full derivation and verification.

The paper is for researchers already comfortable with stochastic control, belief-space planning, and path integral methods. A reader who knows the standard reduction steps will follow the extension quickly and can judge whether the selector issue is minor or central. It deserves a serious referee because the core mechanism targets a real structural gap and the reduction, if it holds, would be worth checking in detail rather than desk-rejecting outright.

Referee Report

2 major / 2 minor

Summary. The paper claims that path integral control in Gaussian belief space requires a structural matching condition between observation-driven diffusion of the belief mean and actuation authority that a fixed observation matrix cannot enforce. By treating the observation matrix as a control variable and constraining the sensing control to a measurable selector from the resulting matching set, the Hamilton-Jacobi-Bellman equation for the belief mean and covariance reduces to a linear PDE with a Feynman-Kac representation.

Significance. If the reduction is rigorously established, including selector existence and invariance of the Gaussian belief structure, the result would meaningfully extend path integral methods to partially observed systems with controlled sensing, enabling tractable solutions to otherwise nonlinear HJB problems in belief space. The approach correctly leverages standard HJB and Feynman-Kac machinery but hinges on the novel controlled-sensing mechanism to achieve linearity.

major comments (2)

[Abstract and derivation of the reduced PDE] The central reduction (abstract and main theorem) requires that, for every admissible belief state, a measurable selector from the matching set exists such that the observation-driven diffusion exactly cancels the control authority term while leaving covariance evolution unchanged. The manuscript provides no explicit construction, existence proof, or invariance argument showing that the selector preserves Gaussianity and does not introduce state-dependent jumps or non-Gaussianity; this assumption is load-bearing for the linearity claim.
[Belief dynamics and matching condition] The statement that covariance dynamics remain unaffected by selector choice (belief dynamics section) is asserted without an explicit invariance proof under the selector; if the selector depends on the instantaneous belief in a manner that couples back into the diffusion terms, the Gaussian closure and subsequent Feynman-Kac representation would fail.

minor comments (2)

[Introduction] Define the matching set and the selector function u_s(t, belief) with precise notation in the introduction or preliminary section rather than deferring to later technical development.
[Preliminaries] Add a short remark on the dimension of the observation matrix relative to the state to clarify when the matching set is non-empty.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. The feedback highlights important points regarding the rigor of the selector existence and invariance arguments, which we address below. We will revise the manuscript to strengthen these aspects while preserving the core contribution.

Point-by-point responses

Referee: [Abstract and derivation of the reduced PDE] The central reduction (abstract and main theorem) requires that, for every admissible belief state, a measurable selector from the matching set exists such that the observation-driven diffusion exactly cancels the control authority term while leaving covariance evolution unchanged. The manuscript provides no explicit construction, existence proof, or invariance argument showing that the selector preserves Gaussianity and does not introduce state-dependent jumps or non-Gaussianity; this assumption is load-bearing for the linearity claim.

Authors: We agree that the current version does not include an explicit construction or full existence proof for the measurable selector. The matching set is defined pointwise for each belief state via the structural condition on the observation-driven diffusion term. By standard measurable selection results (Kuratowski-Ryll-Nardzewski theorem), a measurable selector exists whenever the set-valued map is closed-valued and nonempty, which holds under the problem's controllability and observability assumptions. The selector acts only on the mean diffusion term and does not alter the covariance Riccati dynamics, thereby preserving the Gaussian closure. We will add an appendix providing the explicit selector construction via the theorem and a short invariance argument confirming that no state-dependent jumps or non-Gaussianity are introduced. (Revision: yes)
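For reference, the selection theorem invoked in this response can be stated as follows. This is the standard textbook form, written in notation assumed here; how the paper's matching set fits the hypotheses is not shown on this page.

```latex
% Kuratowski--Ryll-Nardzewski measurable selection theorem (standard statement).
\textbf{Theorem.} Let $(\Omega, \mathcal{F})$ be a measurable space, let $X$ be a
Polish space, and let $F : \Omega \rightrightarrows X$ be a set-valued map with
nonempty closed values. Suppose $F$ is weakly measurable, that is,
\[
  \{\, \omega \in \Omega : F(\omega) \cap U \neq \emptyset \,\} \in \mathcal{F}
  \quad \text{for every open } U \subseteq X .
\]
Then there exists an $\mathcal{F}$-measurable function $f : \Omega \to X$ with
$f(\omega) \in F(\omega)$ for all $\omega \in \Omega$.
```

Besides nonempty closed values, the theorem asks for weak measurability of the set-valued map, so an appendix along the lines promised here would presumably need to verify that property for the belief-to-matching-set map as well.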
Referee: [Belief dynamics and matching condition] The statement that covariance dynamics remain unaffected by selector choice (belief dynamics section) is asserted without an explicit invariance proof under the selector; if the selector depends on the instantaneous belief in a manner that couples back into the diffusion terms, the Gaussian closure and subsequent Feynman-Kac representation would fail.

Authors: The covariance evolution follows the standard Kalman-Bucy Riccati equation, which depends on the chosen observation matrix but is independent of the particular selector value once the matrix is fixed. Because the selector is chosen only to satisfy the mean-matching condition and the covariance update does not feed back into the selector definition in a way that violates closure, Gaussianity is preserved. We acknowledge the assertion was stated without a dedicated invariance lemma. In the revision we will insert a short proof in the belief-dynamics section showing that the selector, being a measurable function of the current belief, leaves the covariance ODE unchanged and therefore maintains the conditions required for the Feynman-Kac representation. (Revision: yes)
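The dependence of the Riccati equation on the observation matrix, mentioned in this response, can be seen in a small numerical sketch. The code below integrates the standard Kalman-Bucy covariance equation dP/dt = A P + P Aᵀ + W − P Cᵀ V⁻¹ C P for two fixed observation matrices; the double-integrator model, noise levels, and function names are assumptions for illustration only.

```python
import numpy as np

def riccati_rhs(P, A, W, C, V):
    """Right-hand side of the Kalman-Bucy covariance equation."""
    info_term = P @ C.T @ np.linalg.solve(V, C @ P)
    return A @ P + P @ A.T + W - info_term

def integrate_riccati(P0, A, W, C, V, T=1.0, n_steps=1000):
    """Fixed-step RK4 integration of the Riccati ODE from P0 over [0, T]."""
    dt = T / n_steps
    P = P0.copy()
    for _ in range(n_steps):
        k1 = riccati_rhs(P, A, W, C, V)
        k2 = riccati_rhs(P + 0.5 * dt * k1, A, W, C, V)
        k3 = riccati_rhs(P + 0.5 * dt * k2, A, W, C, V)
        k4 = riccati_rhs(P + dt * k3, A, W, C, V)
        P = P + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        P = 0.5 * (P + P.T)  # remove round-off asymmetry
    return P

# Illustrative double integrator with position-only sensing (assumed values).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
W = 0.1 * np.eye(2)
V = np.array([[1.0]])
P0 = np.eye(2)
P_weak = integrate_riccati(P0, A, W, np.array([[0.5, 0.0]]), V)
P_strong = integrate_riccati(P0, A, W, np.array([[2.0, 0.0]]), V)
print(np.trace(P_weak), np.trace(P_strong))
```

A stronger observation gain leaves less residual uncertainty, so the two covariance trajectories differ. Any invariance argument therefore has to explain why all admissible selections produce the same information term, rather than rely on the covariance being insensitive to the observation matrix in general.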

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard HJB reduction under an introduced control constraint.

full rationale

The paper defines a matching condition required for path-integral control in Gaussian belief space, treats the observation matrix as a control input to form the matching set, and then shows that selecting from this set reduces the HJB to a linear PDE admitting a Feynman-Kac representation. This is a direct substitution of the matching condition into the HJB equation rather than a self-referential definition or fitted parameter renamed as a prediction. No self-citation chain is load-bearing for the central reduction, and the derivation remains self-contained against the external benchmarks of stochastic control theory and Feynman-Kac representations. The existence of a measurable selector is an assumption whose verification lies outside the circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on the existence of a matching set between observation-driven diffusion and actuation, plus the ability to select from it measurably while keeping the belief Gaussian; these are domain assumptions in stochastic control rather than new postulates.

axioms (2)

domain assumption: Path integral control applies in Gaussian belief space.
The abstract explicitly frames the problem within Gaussian belief space for path integral control.
domain assumption: A structural matching condition between observation-driven diffusion of the belief mean and actuation authority is required.
The abstract states this condition as necessary and notes that a fixed observation matrix cannot enforce it.

pith-pipeline@v0.9.0 · 5353 in / 1331 out tokens · 46533 ms · 2026-05-10T02:45:50.941520+00:00 · methodology

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1] H. J. Kappen, “Path integrals and symmetry breaking for optimal control theory,” J. Stat. Mech.: Theory Exp., vol. 2005, no. 11, p. P11011, 2005.
[2] H. J. Kappen, “Linear theory for control of nonlinear stochastic systems,” Phys. Rev. Lett., vol. 95, no. 20, p. 200201, 2005.
[3] E. Todorov, “Linearly-solvable Markov decision problems,” Adv. Neural Inf. Process. Syst., vol. 19, 2006.
[4] E. Theodorou, J. Buchli, and S. Schaal, “A generalized path integral control approach to reinforcement learning,” J. Mach. Learn. Res., vol. 11, pp. 3137–3181, 2010.
[5] G. Williams, A. Aldrich, and E. A. Theodorou, “Model predictive path integral control: From theory to parallel computation,” J. Guid. Control Dyn., vol. 40, no. 2, pp. 344–357, 2017.
[6] G. Williams, P. Drews, B. Goldfain, J. M. Rehg, and E. A. Theodorou, “Information-theoretic model predictive control: Theory and applications to autonomous driving,” IEEE Trans. Robot., vol. 34, no. 6, pp. 1603–1622, 2018.
[7] S. Thijssen and H. J. Kappen, “Path integral control and state-dependent feedback,” Phys. Rev. E, vol. 91, no. 3, p. 032104, 2015.
[8] M. S. Gandhi, B. Vlahov, J. Gibson, G. Williams, and E. A. Theodorou, “Robust model predictive path integral control: Analysis and performance guarantees,” IEEE Robot. Autom. Lett., vol. 6, no. 2, pp. 1423–1430, 2021.
[9] J. Yin, A. Iyer, and E. A. Theodorou, “Risk-aware model predictive path integral control,” in Proc. Amer. Control Conf. (ACC), 2023, pp. 3467–3474.
[10] H. J. Kushner, “Dynamical equations for optimal nonlinear filtering,” J. Differ. Equ., vol. 3, no. 2, pp. 179–190, 1967.
[11] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes: General Theory. Springer, 1977, vol. 394.
[12] A. Bensoussan, Stochastic Control of Partially Observable Systems. Cambridge Univ. Press, 1992.
[13] V. Krishnamurthy, Partially Observed Markov Decision Processes. Cambridge Univ. Press, 2016.
[14] R. Platt Jr, R. Tedrake, L. Kaelbling, and T. Lozano-Perez, “Belief space planning assuming maximum likelihood observations,” in Robot.: Sci. Syst. (RSS), 2010.
[15] J. Van Den Berg, S. Patil, and R. Alterovitz, “Motion planning under uncertainty using iterative local optimization in belief space,” Int. J. Robot. Res., vol. 31, no. 11, pp. 1263–1278, 2012.
[16] E. Todorov and W. Li, “A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems,” in Proc. Amer. Control Conf. (ACC), 2005, pp. 300–306.
[17] G. Das and T. Tanaka, “Path integral control in Gaussian belief space for partially observed systems,” arXiv preprint, 2026, available: https://doi.org/10.XXXX/arXiv.XXXX.XXXXX
[18] I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, 2nd ed. Springer, 1991.
