Rate-Cost Tradeoffs in Nonlinear Control
Pith reviewed 2026-05-09 23:12 UTC · model grok-4.3
The pith
For general nonlinear stochastic control, the minimal rate to meet a control cost D lies within a logarithmic additive gap of the directed-information minimum.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We show that F_n(D) ≤ R_n(D) ≤ F_n(D) + log(F_n(D) + 3.4) + 2 + 1/n bits, where F_n(D) is the minimum directed information between the state and control sequences subject to an average cost constraint D. The lower bound follows from standard information inequalities, while the upper bound is achieved constructively by applying the strong functional representation lemma to the conditional distributions at each time step. This establishes directed information as the operationally relevant quantity for rate-limited control of general nonlinear systems and recovers known nonasymptotic bounds for causal rate-distortion and LQG control as special cases.
What carries the argument
The strong functional representation lemma applied sequentially to the conditional distributions of the stochastic control system, which constructs an encoding-and-control policy achieving the stated upper bound on R_n(D).
Load-bearing premise
An encoding-and-control policy can be constructed at each time step using the strong functional representation lemma applied to the conditional distributions of the general stochastic control system.
What would settle it
A concrete finite-horizon nonlinear system and cost threshold D for which the true minimal rate exceeds F_n(D) + log(F_n(D) + 3.4) + 2 + 1/n bits.
Figures
read the original abstract
We study the rate-cost tradeoff in rate-limited control of general stochastic control systems, including nonlinear systems, over a finite horizon. At each time step, an encoder observes the state and transmits a description to a controller, which then selects the control action. For an average control-cost threshold $D$, we characterize the minimum achievable communication rate $R_n(D)$ via a nonasymptotic bound: $R_n(D)$ lies within an additive logarithmic gap of the optimal value of a directed-information minimization $F_n(D)$, namely, we show that $F_n(D) \le R_n(D) \le F_n(D)+\log \bigl(F_n(D)+3.4\bigr)+2+\frac{1}{n}$, in bits. This establishes directed information as the operationally relevant quantity governing rate-limited control, thereby broadening its utility beyond its previously established roles in causal source coding and linear quadratic Gaussian (LQG) control to general nonlinear control systems. We prove the upper bound constructively by building an encoding-and-control policy using the strong functional representation lemma at each time step. As special cases of our setting, our framework yields nonasymptotic bounds for sequential (causal) rate-distortion and LQG control.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper considers finite-horizon rate-cost tradeoffs for general (including nonlinear) stochastic control systems in which an encoder observes the state and sends a description to a controller that chooses the action. For a given average cost threshold D it defines R_n(D) as the minimum achievable rate and F_n(D) as the minimum directed information. It claims the non-asymptotic sandwich F_n(D) ≤ R_n(D) ≤ F_n(D) + log(F_n(D) + 3.4) + 2 + 1/n (in bits) and proves the upper bound constructively by applying the strong functional representation lemma at each time step to the conditional distributions P(U_t | X^t, U^{t-1}). Special cases recover non-asymptotic bounds for causal rate-distortion and LQG control.
Significance. If the claimed gap holds, the result would make directed information the operationally relevant quantity for rate-limited nonlinear control, extending its known roles in causal source coding and LQG. The non-asymptotic character and explicit constructive policy are positive features.
major comments (1)
- [Abstract] Abstract (upper-bound construction): the claimed additive gap is only logarithmic in F_n(D). The stated proof applies the strong functional representation lemma separately at each of the n time steps, each application contributing an overhead of log(I_t + 3.4) + O(1) where I_t = I(X_t; U_t | past). The total overhead is therefore ∑ log(I_t + 3.4) + O(n). Jensen’s inequality then yields at most n log(F_n/n + 3.4) + O(n), which is linear in n rather than logarithmic in F_n. No global block-coding, shared randomness, or non-causal compression step that would reduce the overhead to a single log(F_n) term while preserving causality and the state-dependent dynamics is described. This discrepancy is load-bearing for the central sandwich claim.
minor comments (1)
- The abstract states that the framework yields non-asymptotic bounds for sequential rate-distortion and LQG as special cases, but the precise specialization steps (e.g., how the general directed-information functional reduces to the known LQG expression) are not visible in the provided text.
Simulated Author's Rebuttal
We thank the referee for the detailed review and for highlighting this critical point regarding our upper bound. We provide a point-by-point response below.
read point-by-point responses
-
Referee: [Abstract] Abstract (upper-bound construction): the claimed additive gap is only logarithmic in F_n(D). The stated proof applies the strong functional representation lemma separately at each of the n time steps, each application contributing an overhead of log(I_t + 3.4) + O(1) where I_t = I(X_t; U_t | past). The total overhead is therefore ∑ log(I_t + 3.4) + O(n). Jensen’s inequality then yields at most n log(F_n/n + 3.4) + O(n), which is linear in n rather than logarithmic in F_n. No global block-coding, shared randomness, or non-causal compression step that would reduce the overhead to a single log(F_n) term while preserving causality and the state-dependent dynamics is described. This discrepancy is load-bearing for the central sandwich claim.
Authors: We appreciate the referee's careful analysis of the upper bound construction. The manuscript indeed proposes applying the strong functional representation lemma at each time step to construct the encoding policy. As the referee correctly notes, this leads to a cumulative overhead of ∑_t log(I_t + 3.4) + O(n). Using the concavity of the logarithm and Jensen's inequality, the overhead is at most n log(F_n(D)/n + 3.4) + O(n), which is linear in the horizon n. The paper does not provide a mechanism such as shared randomness across time steps or a block-coding approach that preserves causality to achieve a single logarithmic term. We acknowledge that the claimed bound F_n(D) ≤ R_n(D) ≤ F_n(D) + log(F_n(D) + 3.4) + 2 + 1/n does not follow from the described construction. In the revised version, we will update the abstract, the statement of the main result, and the proof to reflect the correct overhead, changing the upper bound to R_n(D) ≤ F_n(D) + n log(F_n(D)/n + 3.4) + C for some constant C. We will also discuss whether a logarithmic gap is achievable under additional assumptions or with a modified construction. This revision will be made. revision: yes
Circularity Check
No significant circularity; F_n(D) and R_n(D) defined independently with explicit constructive upper bound
full rationale
F_n(D) is defined as the minimum directed information I(X^n → U^n) subject to the average cost constraint, while R_n(D) is the infimum operational rate over all causal encoding-control policies achieving the same cost. The lower bound follows from standard directed-information data-processing inequalities that hold for any joint distribution induced by a policy. The upper bound is established by an explicit per-step construction that invokes the strong functional representation lemma on the conditional distributions P(U_t | X^t, U^{t-1}), producing a concrete policy whose rate is bounded by F_n(D) plus the stated logarithmic term. This construction is independent of the definition of F_n(D) and does not rely on fitting parameters to data, self-referential definitions, or load-bearing self-citations. The derivation chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math The strong functional representation lemma applies to the relevant conditional distributions at each time step of the control system.
Reference graph
Works this paper leans on
-
[1]
Agreement in teams and the dynamic program- ming approach under information constraints,
S. Y ¨uksel and T. Bas ¸ar, “Agreement in teams and the dynamic program- ming approach under information constraints,” inStochastic Networked Control Systems: Stabilization and Optimization under Information Constraints, ser. Systems & Control: Foundations & Applications. New York, NY: Birkh¨auser, May 2013, pp. 399–421
work page 2013
-
[2]
L. Zhao, Y .-K. Chia, and T. Weissman, “Compression with actions,” IEEE Transactions on Information Theory, vol. 60, no. 2, pp. 796–807, Feb. 2014
work page 2014
-
[3]
Rate-cost tradeoffs in scalar LQG control and tracking with side information,
V . Kostina and B. Hassibi, “Rate-cost tradeoffs in scalar LQG control and tracking with side information,” in56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct. 2018, pp. 421–428
work page 2018
-
[4]
Rate-cost tradeoffs in continuous-time control with a biomolecular application,
Y . Nakahira, F. Xiao, V . Kostina, and J. C. Doyle, “Rate-cost tradeoffs in continuous-time control with a biomolecular application,”IEEE Transactions on Automatic Control, vol. 71, no. 4, pp. 2614–2621, Apr. 2026
work page 2026
-
[5]
Rate-cost tradeoffs in control,
V . Kostina and B. Hassibi, “Rate-cost tradeoffs in control,”IEEE Transactions on Automatic Control, vol. 64, no. 11, pp. 4525–4540, Nov. 2019
work page 2019
-
[6]
A mathematical theory of communication,
C. E. Shannon, “A mathematical theory of communication,”Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, Jul. 1948, part II: vol. 27, no. 4, pp. 623–656 (Oct. 1948)
work page 1948
-
[7]
Coding theorems for a discrete source with a fidelity criterion,
——, “Coding theorems for a discrete source with a fidelity criterion,” inIRE National Convention Record, Part 4, vol. 7, Mar. 1959, pp. 142– 163
work page 1959
-
[8]
Berger,Rate Distortion Theory: A Mathematical Basis for Data Com- pression, ser
T. Berger,Rate Distortion Theory: A Mathematical Basis for Data Com- pression, ser. Prentice-Hall Series in Information and System Sciences. Englewood Cliffs, NJ: Prentice-Hall, 1971
work page 1971
-
[9]
T. M. Cover and J. A. Thomas,Elements of Information Theory, 2nd ed. Hoboken, NJ: Wiley-Interscience, Jul. 2006
work page 2006
-
[10]
A. Gersho and R. M. Gray,Vector Quantization and Signal Compression, ser. The Kluwer International Series in Engineering and Computer Science. Boston, MA: Kluwer Academic Publishers, 1992
work page 1992
-
[11]
R. M. Gray and D. L. Neuhoff, “Quantization,”IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2325–2383, Oct. 1998
work page 1998
-
[12]
Rate-distortion in closed-loop LTI systems,
E. I. Silva, M. S. Derpich, and J. Østergaard, “Rate-distortion in closed-loop LTI systems,” in2013 Information Theory and Applications Workshop (ITA). IEEE, Feb. 2013, pp. 1–2
work page 2013
-
[13]
LQG control with minimum directed information: Semidefinite programming approach,
T. Tanaka, P. M. Esfahani, and S. K. Mitter, “LQG control with minimum directed information: Semidefinite programming approach,” IEEE Transactions on Automatic Control, vol. 63, no. 1, pp. 37–52, Jan. 2018
work page 2018
-
[14]
On the structure of real-time source coders,
H. S. Witsenhausen, “On the structure of real-time source coders,”Bell System Technical Journal, vol. 58, no. 6, pp. 1437–1451, Jul.–Aug. 1979
work page 1979
-
[15]
D. L. Neuhoff and R. K. Gilbert, “Causal source codes,”IEEE Trans- actions on Information Theory, vol. 28, no. 5, pp. 701–713, Sep. 1982
work page 1982
-
[16]
Causality, feedback and directed information,
J. L. Massey, “Causality, feedback and directed information,” inPro- ceedings of the International Symposium on Information Theory and Its Applications (ISITA), Waikiki, HI, Nov. 1990, pp. 303–305
work page 1990
-
[17]
Nonanticipatory and prognostic epsilon-entropies and message generation rates,
A. K. Gorbunov and M. S. Pinsker, “Nonanticipatory and prognostic epsilon-entropies and message generation rates,”Problems of Informa- tion Transmission, vol. 9, pp. 184–191, 1973, English translation from Problemy Peredachi Informatsii
work page 1973
-
[18]
Control under communication constraints,
S. C. Tatikonda, “Control under communication constraints,” Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, Aug. 2000
work page 2000
-
[19]
Nonanticipative rate distortion function and relations to filtering theory,
C. D. Charalambous, P. A. Stavrou, and N. U. Ahmed, “Nonanticipative rate distortion function and relations to filtering theory,”IEEE Transac- tions on Automatic Control, vol. 59, no. 4, pp. 937–952, Apr. 2014
work page 2014
-
[20]
Stochastic linear control over a communication channel,
S. Tatikonda, A. Sahai, and S. K. Mitter, “Stochastic linear control over a communication channel,”IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1549–1561, Sep. 2004
work page 2004
-
[21]
E. I. Silva, M. S. Derpich, J. Østergaard, and M. A. Encina, “A characterization of the minimal average data rate that guarantees a given closed-loop performance level,”IEEE Transactions on Automatic Control, vol. 61, no. 8, pp. 2171–2186, Aug. 2016
work page 2016
-
[22]
Algorithms for optimal control with fixed-rate feedback,
A. Khina, Y . Nakahira, Y . Su, and B. Hassibi, “Algorithms for optimal control with fixed-rate feedback,” inProceedings of the 56th IEEE Conference on Decision and Control (CDC), Melbourne, Australia, Dec. 2017, pp. 6015–6020
work page 2017
-
[23]
Reducing the LQG cost with minimal communication,
O. Sabag, P. Tian, V . Kostina, and B. Hassibi, “Reducing the LQG cost with minimal communication,”IEEE Transactions on Automatic Control, vol. 68, no. 9, pp. 5258–5270, Sep. 2023
work page 2023
-
[24]
S. Y ¨uksel and T. Bas ¸ar,Stochastic Networked Control Systems: Stabi- lization and Optimization under Information Constraints, ser. Systems & Control: Foundations and Applications. New York, NY: Birkh ¨auser, May 2013
work page 2013
-
[25]
How much information is needed in quantized nonlinear control?
C. Zheng, L. Li, L. Wang, and C. Li, “How much information is needed in quantized nonlinear control?”Science China Information Sciences, vol. 61, no. 9, p. 092205, Jan. 2018
work page 2018
-
[26]
Quantized nonlinear control—a survey,
Z. Jiang and T. Liu, “Quantized nonlinear control—a survey,”Acta Automatica Sinica, vol. 39, no. 11, pp. 1820–1830, Nov. 2013
work page 2013
-
[27]
Strong functional representation lemma and applications to coding theorems,
C. T. Li and A. E. Gamal, “Strong functional representation lemma and applications to coding theorems,”IEEE Transactions on Information Theory, vol. 64, no. 11, pp. 6967–6978, Nov. 2018
work page 2018
-
[28]
C. T. Li, “Discrete layered entropy, conditional compression and a tighter strong functional representation lemma,”arXiv preprint arXiv:2501.13736, Jan. 2025
-
[29]
Directed information on abstract spaces: Properties and variational equalities,
C. D. Charalambous and P. A. Stavrou, “Directed information on abstract spaces: Properties and variational equalities,”IEEE Transactions on Information Theory, vol. 62, no. 11, pp. 6019–6052, Nov. 2016. 11
work page 2016
-
[30]
Optimal causal coding-decoding prob- lems,
J. Walrand and P. P. Varaiya, “Optimal causal coding-decoding prob- lems,”IEEE Transactions on Information Theory, vol. 29, no. 6, pp. 814–820, Nov. 1983
work page 1983
-
[31]
Optimal sequential vector quantization of Markov sources,
V . S. Borkar, S. K. Mitter, and S. Tatikonda, “Optimal sequential vector quantization of Markov sources,”SIAM Journal on Control and Optimization, vol. 40, no. 1, pp. 135–148, Jan. 2001
work page 2001
-
[32]
On the structure of optimal real-time encoders and decoders in Markov sources,
N. Teneketzis, “On the structure of optimal real-time encoders and decoders in Markov sources,”IEEE Transactions on Information Theory, vol. 52, no. 8, pp. 3723–3735, Aug. 2006
work page 2006
-
[33]
On optimal zero-delay coding of vector Markov sources,
T. Linder and S. Y ¨uksel, “On optimal zero-delay coding of vector Markov sources,”IEEE Transactions on Information Theory, vol. 60, no. 10, pp. 5975–5991, Oct. 2014
work page 2014
-
[34]
Optimal zero delay coding of Markov sources: Stationary and finite memory codes,
R. G. Wood, T. Linder, and S. Y ¨uksel, “Optimal zero delay coding of Markov sources: Stationary and finite memory codes,”IEEE Transac- tions on Information Theory, vol. 63, no. 5, pp. 2863–2882, May 2017
work page 2017
-
[35]
M. Ghomi, T. Linder, and S. Y ¨uksel, “Zero-delay lossy coding of linear vector Markov sources: Optimality of stationary codes and near optimality of finite memory codes,”IEEE Transactions on Information Theory, vol. 68, no. 5, pp. 3474–3488, May 2022
work page 2022
-
[36]
Sequential coding of correlated sources,
H. Viswanathan and T. Berger, “Sequential coding of correlated sources,” IEEE Transactions on Information Theory, vol. 46, no. 1, pp. 236–246, Jan. 2000
work page 2000
-
[37]
E. Yang, L. Zheng, D. He, and Z. Zhang, “Rate distortion theory for causal video coding: Characterization, computation algorithm, and comparison,”IEEE Transactions on Information Theory, vol. 57, no. 8, pp. 5258–5280, Aug. 2011
work page 2011
-
[38]
The CEO problem with inter-block memory,
V . Kostina and B. Hassibi, “The CEO problem with inter-block memory,” IEEE Transactions on Information Theory, vol. 67, no. 12, pp. 7752– 7768, Dec. 2021
work page 2021
-
[39]
Sequential source coding: An optimization viewpoint,
V . S. Borkar, S. K. Mitter, A. Sahai, and S. Tatikonda, “Sequential source coding: An optimization viewpoint,” inProceedings of the 44th IEEE Conference on Decision and Control and 8th European Control Conference (CDC-ECC). Seville, Spain: IEEE, Dec. 2005, pp. 1035– 1042
work page 2005
-
[40]
Opportunistic scheduling with limited channel state information: A rate distortion approach,
M. Johnston, E. Modiano, and Y . Polyanskiy, “Opportunistic scheduling with limited channel state information: A rate distortion approach,” inProceedings of the IEEE International Symposium on Information Theory (ISIT). Honolulu, HI, USA: IEEE, Jun. 2014, pp. 1371–1375
work page 2014
-
[41]
On delayed sequential coding of correlated sources,
N. Ma and P. Ishwar, “On delayed sequential coding of correlated sources,”IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3763–3782, Jun. 2011
work page 2011
-
[42]
Sequential source coding for stochastic systems subject to finite rate constraints,
P. A. Stavrou, M. Skoglund, and T. Tanaka, “Sequential source coding for stochastic systems subject to finite rate constraints,”IEEE Transac- tions on Automatic Control, vol. 67, no. 8, pp. 3822–3835, Aug. 2022
work page 2022
-
[43]
Source coding with feed- forward: Rate-distortion theorems and error exponents for a general source,
R. Venkataramanan and S. S. Pradhan, “Source coding with feed- forward: Rate-distortion theorems and error exponents for a general source,”IEEE Transactions on Information Theory, vol. 53, no. 6, pp. 2154–2179, Jun. 2007
work page 2007
-
[44]
Computable bounds for rate distortion with feed forward for stationary and ergodic sources,
I. Naiss and H. H. Permuter, “Computable bounds for rate distortion with feed forward for stationary and ergodic sources,”IEEE Transactions on Information Theory, vol. 59, no. 2, pp. 760–781, Feb. 2013
work page 2013
-
[45]
Source coding with a side information “vending machine
H. H. Permuter and T. Weissman, “Source coding with a side information “vending machine”,”IEEE Transactions on Information Theory, vol. 57, no. 7, pp. 4530–4544, Jul. 2011
work page 2011
-
[46]
Multiterminal source coding with action-dependent side information,
Y . Chia, H. Asnani, and T. Weissman, “Multiterminal source coding with action-dependent side information,”IEEE Transactions on Information Theory, vol. 59, no. 6, pp. 3653–3667, Jun. 2013
work page 2013
-
[47]
Distributed and cascade lossy source coding with a side information “vending machine
B. Ahmadi and O. Simeone, “Distributed and cascade lossy source coding with a side information “vending machine”,”IEEE Transactions on Information Theory, vol. 59, no. 10, pp. 6807–6819, Oct. 2013
work page 2013
-
[48]
Blahut- Arimoto algorithm and code design for action-dependent source coding problems,
K. F. Trillingsgaard, O. Simeone, P. Popovski, and T. Larsen, “Blahut- Arimoto algorithm and code design for action-dependent source coding problems,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Jul. 2013, pp. 1192–1196
work page 2013
-
[49]
Evaluation of an achievable rate region for the broadcast channel,
B. E. Hajek and M. B. Pursley, “Evaluation of an achievable rate region for the broadcast channel,”IEEE Transactions on Information Theory, vol. 25, no. 1, pp. 36–46, Jan. 1979
work page 1979
-
[50]
The discrete memoryless multiple-access channel with cribbing encoders,
F. M. J. Willems and E. C. van der Meulen, “The discrete memoryless multiple-access channel with cribbing encoders,”IEEE Transactions on Information Theory, vol. 31, no. 3, pp. 313–327, May 1985
work page 1985
-
[51]
Nonasymptotic oblivious relaying and variable-length noisy lossy source coding,
Y . Liu, S. H. Advary, and C. T. Li, “Nonasymptotic oblivious relaying and variable-length noisy lossy source coding,” inProceedings of the IEEE International Symposium on Information Theory (ISIT), Jun. 2025, pp. 1859–1864
work page 2025
-
[52]
One-shot coding over general noisy networks,
Y . Liu and C. T. Li, “One-shot coding over general noisy networks,” IEEE Transactions on Information Theory, vol. 71, no. 11, pp. 8346– 8357, Nov. 2025
work page 2025
-
[53]
Rate of prefix-free codes in LQG control systems,
T. Tanaka, K. H. Johansson, T. Oechtering, H. Sandberg, and M. Skoglund, “Rate of prefix-free codes in LQG control systems,”arXiv preprint arXiv:1604.01227, 2016
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.