Dual Control of Linear Systems from Bilinear Observations with Belief Space Model Predictive Control

Andrew Lowitt; Beixi Du; Daniel Cao; Sarah Dean; Sunmook Choi; Yahya Sattar

arxiv: 2604.24663 · v1 · submitted 2026-04-27 · 🧮 math.OC · cs.LG · cs.SY · eess.SY

Pith reviewed 2026-05-08 02:20 UTC · model grok-4.3

classification 🧮 math.OC · cs.LG · cs.SY · eess.SY

keywords dual control · belief space MPC · bilinear observations · input-dependent Kalman filter · separation principle · linear quadratic control · model predictive control

The pith

Belief-space MPC plans over state estimates and input-dependent covariances to improve control when actions affect observation quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses finite-horizon quadratic control of linear systems whose observations are bilinear in the state and control input. Because the control directly influences future measurement quality, the classical separation of estimation and control no longer holds. The authors introduce belief-space model predictive control that optimizes sequences of inputs while propagating both the estimated state and the evolving error covariance produced by an input-dependent Kalman filter. A deterministic surrogate replaces the stochastic belief update inside the planner. Numerical experiments on two synthetic problems show that, in favorable regimes where inputs can improve observation quality, this approach can yield lower estimation covariance and lower total cost than the separation-principle controller or its MPC variant.

Core claim

In finite-horizon quadratic control of linear systems with bilinear observations, the separation principle fails because control inputs affect the future quality of state estimates obtained from an input-dependent Kalman filter. Belief-space model predictive control (B-MPC) addresses this by planning directly over both the estimated state and its error covariance, using a deterministic surrogate of the belief evolution. In synthetic numerical tests this produces lower estimation covariance and more uncertainty-aware actions than either the separation-principle controller or its MPC variant.
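To make the coupling between input and estimate quality concrete, here is a minimal scalar sketch of an input-dependent Kalman filter step. The model form `y = (c0 + c1 * u) * x + v`, the function name `kalman_step`, and every numeric value are assumptions chosen for illustration; they are not taken from the paper.

```python
def kalman_step(x_hat, sigma, u, y, a=0.9, b=1.0, c0=0.1, c1=1.0,
                w_var=0.1, v_var=0.5):
    """One measurement update plus time update for the assumed scalar model
        x_next = a * x + b * u + w,    y = (c0 + c1 * u) * x + v.
    All numeric defaults are illustrative assumptions."""
    c = c0 + c1 * u                          # input-dependent observation gain
    gain = sigma * c / (c * c * sigma + v_var)
    x_post = x_hat + gain * (y - c * x_hat)  # correct with the innovation
    sigma_post = (1.0 - gain * c) * sigma    # posterior error variance
    x_next = a * x_post + b * u              # predict with the known input
    sigma_next = a * a * sigma_post + w_var
    return x_next, sigma_next


# Same prior belief and measurement, three different inputs.
for u in (-0.1, 0.0, 1.0):
    _, sigma_next = kalman_step(x_hat=0.0, sigma=1.0, u=u, y=0.0)
    print(f"u = {u:+.1f}  ->  next error variance = {sigma_next:.4f}")
```

In this sketch the input `u = -0.1` cancels the observation gain, so the measurement carries no information, while larger `|c0 + c1 * u|` shrinks the next error variance. That dependence of the covariance on the input is what breaks the separation of estimation and control.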

What carries the argument

Belief-space model predictive control (B-MPC) that optimizes over state estimates and the deterministic trajectory of the input-dependent Kalman-filter covariance matrix.
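A rough scalar sketch of planning over a deterministic belief surrogate follows. It assumes the surrogate propagates the estimate with the innovation set to zero and the variance with the filter recursion, and that the stage cost is `q * (x_hat**2 + sigma) + r * u**2`. The model, the parameter values, and the coordinate grid search are illustrative assumptions; Figure 8 indicates the paper's experiments use L-BFGS, which is swapped here for a simpler search to keep the example dependency-free.

```python
def surrogate_cost(x_hat, sigma, inputs, plan_covariance=True,
                   a=0.9, b=1.0, c0=0.0, c1=1.0,
                   w_var=0.1, v_var=0.5, q=1.0, r=0.1):
    """Cost of an input sequence under the assumed deterministic surrogate."""
    cost = 0.0
    for u in inputs:
        cost += q * x_hat ** 2 + r * u ** 2
        if plan_covariance:
            cost += q * sigma                 # uncertainty penalty
        c = c0 + c1 * u                       # input-dependent observation gain
        sigma_post = sigma * v_var / (c * c * sigma + v_var)
        x_hat = a * x_hat + b * u             # innovation replaced by zero
        sigma = a * a * sigma_post + w_var
    cost += q * x_hat ** 2                    # terminal cost
    if plan_covariance:
        cost += q * sigma
    return cost


def plan(x_hat, sigma, horizon, plan_covariance, sweeps=5):
    """Coordinate grid search over the input sequence (illustrative optimizer)."""
    grid = [i / 10.0 for i in range(-20, 21)]
    inputs = [0.0] * horizon
    for _ in range(sweeps):
        for t in range(horizon):
            best_u = inputs[t]
            best = surrogate_cost(x_hat, sigma, inputs, plan_covariance)
            for u in grid:
                trial = inputs[:t] + [u] + inputs[t + 1:]
                cost = surrogate_cost(x_hat, sigma, trial, plan_covariance)
                if cost < best:               # accept strict improvements only
                    best_u, best = u, cost
            inputs[t] = best_u
    return inputs


belief_plan = plan(0.0, 1.0, horizon=5, plan_covariance=True)
mean_only_plan = plan(0.0, 1.0, horizon=5, plan_covariance=False)
print("belief-space plan:", belief_plan)
print("mean-only plan:   ", mean_only_plan)
```

With the estimate already at the origin, the mean-only planner has no reason to move, whereas the belief-space planner spends some control effort to make the state observable. The mean-only planner is only a stand-in for a certainty-equivalent baseline and should not be read as the paper's Sep-MPC implementation.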

If this is right

B-MPC outperforms separation-principle controllers and their MPC variants in regimes where control inputs improve future observation quality.
The method produces lower closed-loop estimation covariance than non-dual controllers.
Selected actions explicitly trade off immediate cost against future reduction in uncertainty.
The approach applies to any finite-horizon linear-quadratic problem whose observation model is bilinear in state and input.
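The trade-off between immediate cost and later uncertainty reduction can be read off a standard identity for Gaussian beliefs. The symbols $Q$, $\hat{x}_t$, and $\Sigma_t$ are notation assumed here for the state cost matrix, the state estimate, and the error covariance; the page itself does not fix a notation.

```latex
\mathbb{E}\!\left[x_t^\top Q\, x_t \mid \text{history}\right]
  = \hat{x}_t^\top Q\, \hat{x}_t + \operatorname{tr}\!\left(Q\, \Sigma_t\right),
\qquad
x_t \mid \text{history} \sim \mathcal{N}\!\left(\hat{x}_t, \Sigma_t\right).
```

Because $\Sigma_t$ depends on earlier inputs through the input-dependent filter, an input that raises the control cost now can still pay off by lowering the trace term at later steps.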

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same deterministic-surrogate idea could be tested on systems whose observation model is only approximately bilinear or mildly nonlinear.
Replacing the deterministic covariance propagation with sampled trajectories inside the planner would quantify the approximation error for highly stochastic regimes.
The framework suggests that explicit modeling of information-gathering value can be added to standard MPC without leaving the linear-quadratic setting.

Load-bearing premise

The deterministic surrogate of the stochastic belief evolution defined by the input-dependent Kalman filter is sufficiently accurate to produce effective control plans despite the underlying randomness in states and observations.

What would settle it

A side-by-side run on the same linear system in which the planner uses the deterministic covariance surrogate versus a version that draws full stochastic realizations of the belief trajectory; if the performance gap disappears or reverses under high process noise, the surrogate approximation is the limiting factor.
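One small ingredient of such a test can be sketched directly: for a fixed input sequence, compare the variance predicted by the deterministic recursion with the squared estimation error averaged over sampled realizations. The scalar model and all constants below are illustrative assumptions. The sketch covers only the open-loop case, where the recursion is exact under linear-Gaussian assumptions; it does not settle the closed-loop question raised above.

```python
import random

A, B, C0, C1 = 0.9, 1.0, 0.1, 1.0   # assumed scalar model parameters
W_VAR, V_VAR = 0.1, 0.5             # assumed process / measurement noise variances


def predicted_variances(sigma0, inputs):
    """Deterministic variance recursion for a fixed input sequence."""
    sigma, out = sigma0, []
    for u in inputs:
        c = C0 + C1 * u
        sigma = A * A * sigma * V_VAR / (c * c * sigma + V_VAR) + W_VAR
        out.append(sigma)
    return out


def empirical_variances(sigma0, inputs, trials=5000, seed=0):
    """Mean squared estimation error over sampled state/observation paths."""
    rng = random.Random(seed)
    sq_err = [0.0] * len(inputs)
    for _ in range(trials):
        x = rng.gauss(0.0, sigma0 ** 0.5)      # true state drawn from the prior
        x_hat, sigma = 0.0, sigma0
        for t, u in enumerate(inputs):
            c = C0 + C1 * u
            y = c * x + rng.gauss(0.0, V_VAR ** 0.5)
            gain = sigma * c / (c * c * sigma + V_VAR)
            x_post = x_hat + gain * (y - c * x_hat)
            sigma_post = (1.0 - gain * c) * sigma
            x = A * x + B * u + rng.gauss(0.0, W_VAR ** 0.5)
            x_hat = A * x_post + B * u
            sigma = A * A * sigma_post + W_VAR
            sq_err[t] += (x - x_hat) ** 2
    return [s / trials for s in sq_err]


inputs = [1.0, 0.0, -0.5, 1.0]
for p, e in zip(predicted_variances(1.0, inputs), empirical_variances(1.0, inputs)):
    print(f"predicted {p:.3f}   sampled {e:.3f}")
```

A fuller version of the experiment would place the sampling inside the planner and rerun the closed loop at several process-noise levels, which is beyond this sketch.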

Figures

Figures reproduced from arXiv: 2604.24663 by Andrew Lowitt, Beixi Du, Daniel Cao, Sarah Dean, Sunmook Choi, Yahya Sattar.

**Figure 1.** Total cost versus look-ahead horizon H ∈ {5, 10, 15, 20, 25, 30} for the three controllers, averaged over 10 trials. Shaded regions denote 95% confidence intervals across trials. Left: random bilinear observation system. Right: multi-block double integrator system. Under the parameter setting in …

**Figure 2.** Kalman filter diagnostics on the multi-block double integrator system with …

**Figure 3.** Counterfactual action comparison between …

**Figure 4.** Action difference $\|u_{\text{B-MPC}} - u_{\text{Sep-MPC}}\|_2$ versus tr(Σ) for synthetic belief states. Results are computed from 10 sampled state estimates combined with 20 log-spaced covariance scales. The black line shows the median across synthetic beliefs, and the shaded region indicates the interquartile range

**Figure 5.** Percentage cost improvement of B-MPC over Sep across $R_{\text{scale}}$ and $c_0$ settings for the random bilinear and double-integrator systems, with ρ(A) = 0.95. For each ($R_{\text{scale}}$, $c_0$) pair, we select the horizon H that minimizes the mean B-MPC cost over 10 trials, and report the relative gain $100 \times (J_{\text{Sep}} - J_{\text{B-MPC}})/J_{\text{Sep}}$ (%). Each cell is annotated with the corresponding percentage. In …

**Figure 6.** Total cost versus spectral radius for the three controllers, where …

**Figure 7.** Computation time versus planning horizon

**Figure 8.** Total rollout cost versus the maximum number of L-BFGS iterations for …
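Two quantities in the captions above are simple enough to sketch: the relative gain reported in Figure 5 and a 95% confidence interval over 10 trials as in Figure 1. The function names are hypothetical, and the use of a Student-t interval is an assumption; the captions do not say how the intervals were computed.

```python
import math
import statistics


def relative_gain(j_sep, j_bmpc):
    """Percentage cost improvement: 100 * (J_Sep - J_B-MPC) / J_Sep."""
    return 100.0 * (j_sep - j_bmpc) / j_sep


def mean_ci95(costs, critical=2.262):
    """Mean and half-width of a 95% interval across trials.

    The default critical value 2.262 is the two-sided Student-t quantile for
    9 degrees of freedom, i.e. 10 trials (assumed interval construction)."""
    mean = statistics.fmean(costs)
    half_width = critical * statistics.stdev(costs) / math.sqrt(len(costs))
    return mean, half_width


print(relative_gain(200.0, 150.0))   # made-up costs, for illustration only
print(mean_ci95([float(i) for i in range(1, 11)]))
```

A positive gain means B-MPC has the lower cost. For a different number of trials, the critical value would need to be replaced accordingly.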

Original abstract

We study finite-horizon quadratic control of linear systems with bilinear observations, in which the control input affects not only the state dynamics but also the partial observations of the state. In this setting, the separation principle can fail because control inputs influence the future quality of state estimates. State estimation requires an input-dependent Kalman filter whose gain and error covariance evolve as functions of the control inputs. To address this challenge, we propose a belief-space model predictive control ($\texttt{B-MPC}$) method that plans directly over both the estimated state and its error covariance. In particular, $\texttt{B-MPC}$ plans with a deterministic surrogate of the belief evolution defined by the input-dependent Kalman filter. Through numerical experiments in two synthetic settings, we show that $\texttt{B-MPC}$ can outperform both the separation-principle controller and its MPC variant in favorable regimes, and that these gains are accompanied by lower estimation covariance and more uncertainty-aware action choices.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper gives a workable B-MPC extension for dual control in linear systems with bilinear observations by planning over a deterministic surrogate of the input-dependent Kalman filter belief.

The main point is that this paper develops a belief-space MPC for linear systems where the control input affects both the state evolution and the quality of partial observations through a bilinear model. It uses a deterministic surrogate of the input-dependent Kalman filter's belief evolution to make planning tractable.

What the work does well is integrate the estimation covariance directly into the optimization. By planning over both the mean estimate and the covariance matrix, the controller can choose inputs that reduce uncertainty when it matters for the control objective. The two synthetic numerical examples illustrate this, showing lower estimation covariances and more informed action selection compared to controllers that ignore the dual effect.

The soft spots are in the experimental section. The abstract refers to performance in favorable regimes without providing the underlying system matrices, simulation parameters, or any quantitative details on the comparisons. This leaves open questions about how broadly the advantages hold and whether the results depend on particular choices. The deterministic surrogate is key to the method, but there is no discussion of its approximation error or conditions under which it remains effective.

This paper is aimed at people in optimal control and robotics who deal with active sensing or adaptive control tasks. A reader familiar with Kalman filtering and MPC will see the extension clearly. It shows honest engagement with the separation principle failure in this setting and builds a workable algorithm from standard components.

I think it deserves a serious referee. The core idea is sound and the initial results are promising enough to warrant expert input on the experiments and potential extensions. I would send it for peer review.

Referee Report

1 major / 0 minor

Summary. The paper studies finite-horizon quadratic control of linear systems with bilinear observations, in which control inputs affect both state dynamics and observation quality. It proposes a belief-space model predictive control (B-MPC) method that plans directly over the estimated state and its error covariance using a deterministic surrogate of the input-dependent Kalman filter belief evolution. Numerical experiments in two synthetic settings are used to claim that B-MPC outperforms both the separation-principle controller and its MPC variant in favorable regimes, with accompanying reductions in estimation covariance and more uncertainty-aware actions.

Significance. If the empirical claims hold, the work provides a pragmatic approximation method for dual control settings where the separation principle fails. The deterministic surrogate of input-dependent belief evolution is a reasonable modeling choice that could enable uncertainty-aware planning in applications such as sensor scheduling or active perception.

major comments (1)

[Numerical experiments] Numerical experiments section: the abstract and description report outperformance in two synthetic settings but supply no details on experiment design, baseline implementations, number of Monte Carlo trials, statistical tests, or the precise parameter regimes tested. The qualification to 'favorable regimes' therefore cannot be evaluated for robustness or risk of post-hoc selection.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive feedback on our manuscript. We address the single major comment below and agree that additional details are needed to strengthen the presentation of the numerical results.

Referee: Numerical experiments section: the abstract and description report outperformance in two synthetic settings but supply no details on experiment design, baseline implementations, number of Monte Carlo trials, statistical tests, or the precise parameter regimes tested. The qualification to 'favorable regimes' therefore cannot be evaluated for robustness or risk of post-hoc selection.

Authors: We agree that the current numerical experiments section does not provide sufficient detail for readers to assess the robustness of the reported outperformance. In the revised manuscript we will expand the section to include: (i) explicit descriptions of the two synthetic system models, including all parameter values, initial conditions, and horizon lengths; (ii) implementation details for the separation-principle controller and its MPC variant (e.g., how the Kalman filter is run, any approximations used); (iii) the exact number of Monte Carlo trials performed for each reported result; (iv) the statistical measures or tests used to compare controllers; and (v) a clearer delineation of the parameter regimes tested, together with additional results or discussion showing that the favorable regimes were identified a priori rather than through post-hoc selection. These additions will allow the claims to be evaluated more rigorously.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper defines B-MPC directly from the standard input-dependent Kalman filter equations for belief evolution (with deterministic surrogate) combined with quadratic MPC planning over estimated state and covariance. No load-bearing step reduces by construction to a fitted parameter, self-citation chain, or renamed input; the separation-principle baseline and its MPC variant are external comparators, and the numerical experiments in synthetic settings serve as independent empirical validation rather than tautological prediction. The modeling choice is explicitly stated as an approximation whose effectiveness is tested, not assumed by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper rests on standard linear-Gaussian control assumptions and introduces one new algorithmic construct (the deterministic belief surrogate) without additional free parameters or invented physical entities.

axioms (2)

domain assumption: System dynamics are linear and observations are bilinear in the state and control input.
Core modeling choice stated in the problem setup.
standard math: Finite-horizon quadratic cost and Gaussian noise.
Standard assumptions enabling Kalman filter and quadratic MPC.
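For orientation, the two axioms can be written out as follows. This is one common way to pose linear dynamics with bilinear observations and a finite-horizon quadratic cost; the symbols $A$, $B$, $C_0$, $C_k$, $Q$, $Q_T$, $R$, $\Sigma_w$, $\Sigma_v$ and the exact parameterization of the observation matrix are assumptions made here, not notation confirmed by this page.

```latex
\begin{aligned}
x_{t+1} &= A x_t + B u_t + w_t, & w_t &\sim \mathcal{N}(0, \Sigma_w),\\
y_t &= \Big(C_0 + \sum_{k=1}^{p} u_{t,k}\, C_k\Big) x_t + v_t, & v_t &\sim \mathcal{N}(0, \Sigma_v),\\
J &= \mathbb{E}\Big[\sum_{t=0}^{T-1} \big(x_t^\top Q x_t + u_t^\top R u_t\big) + x_T^\top Q_T x_T\Big].
\end{aligned}
```

Here $u_{t,k}$ denotes the $k$-th component of the input $u_t \in \mathbb{R}^p$. The observation is linear in $x_t$ for a fixed input and linear in $u_t$ for a fixed state, which is what makes it bilinear.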

invented entities (1)

Deterministic surrogate of belief evolution (no independent evidence)
purpose: Approximates stochastic belief dynamics to enable deterministic MPC planning.
New construct introduced to make belief-space planning tractable.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages

[1] Sunmook Choi, Yahya Sattar, Yassir Jedra, Maryam Fazel, and Sarah Dean. Explore-then-commit for nonstationary linear bandits with latent dynamics. arXiv preprint arXiv:2510.16208, 2025.

[2] Lars Grüne and Jürgen Pannek. Nonlinear model predictive control. In Nonlinear model predictive control: Theory and algorithms, pages 45–69. Springer, 2016.

[3] Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. 1960.

[4] Elia Kaufmann, Leonard Bauersfeld, Antonio Loquercio, Matthias Müller, Vladlen Koltun, and Davide Scaramuzza. Champion-level drone racing using deep reinforcement learning. Nature, 620(7976):982–987, 2023.

[5] Sahand Hadizadeh Kafash, Justin Koeln, and Justin Ruths. Model predictive control of bilinear systems as uncertain linear systems. In 2022 IEEE Conference on Control Technology and Applications (CCTA), pages 562–567. IEEE, 2022.

[6] Leslie Pack Kaelbling, Michael L. Littman, and Anthony R. Cassandra. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1–2):99–134, 1998.

[7] Diyou Liu and Mohammad Khosravi. System identification for linear dynamics with bilinear observation models: An expectation–maximization approach. In 2024 IEEE 63rd Conference on Decision and Control (CDC), pages 7190–7195, 2024.

[8] David Q Mayne. Model predictive control: Recent developments and future promise. Automatica, 50(12):2967–2986, 2014.

[9] Keiko Nagami and Mac Schwager. State estimation and belief space planning under epistemic uncertainty for learning-based perception systems. IEEE Robotics and Automation Letters, 9(6):5118–5125, 2024.

[10] Christos H. Papadimitriou and John N. Tsitsiklis. The complexity of Markov decision processes. Mathematics of Operations Research, 12(3):441–450, 1987.

[11] Panos M Pardalos and Vitaliy A Yatsenko. Optimization and control of bilinear systems: theory, algorithms, and applications, volume 11. Springer Science & Business Media, 2010.

[12] J.B. Rawlings, D.Q. Mayne, and M. Diehl. Model Predictive Control: Theory, Computation, and Design. Nob Hill Publishing, LLC, 2024.

[13] Yahya Sattar, Sunmook Choi, Yassir Jedra, Maryam Fazel, and Sarah Dean. Sub-optimality of the separation principle for quadratic control from bilinear observations. In 2025 IEEE 64th Conference on Decision and Control (CDC), pages 3862–3867. IEEE, 2025.

[14] Yahya Sattar, Yassir Jedra, and Sarah Dean. Learning linear dynamics from bilinear observations. In 2025 American Control Conference (ACC), pages 3109–3115. IEEE, 2025.

[15] Yifan Xie, Julian Berberich, Robin Strässer, and Frank Allgöwer. Bilinear data-driven min-max MPC: Designing rational controllers via sum-of-squares optimization. In 2025 IEEE 64th Conference on Decision and Control (CDC), pages 1042–1047. IEEE, 2025.

[16] Vrushabh Zinage, Ali Reza Pedram, and Takashi Tanaka. Optimal sampling-based motion planning in Gaussian belief space for minimum sensing navigation, 2023.

[17] Dongliang Zheng, Jack Ridderhof, Panagiotis Tsiotras, and Ali-akbar Agha-mohammadi. Belief space planning: A covariance steering approach. In 2022 International Conference on Robotics and Automation (ICRA), pages 11051–11057. IEEE, 2022.
