Optimal Privacy-Aware Co-Design of Quantizer and Controller in Networked Control Systems

Chuanghong Weng; Ehsan Nekouei

arxiv: 2604.08860 · v1 · submitted 2026-04-10 · 📡 eess.SY · cs.SY

Pith reviewed 2026-05-10 18:07 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords privacy-aware control · quantized control · mutual information regularization · networked control systems · stochastic control · dynamic programming · policy gradient optimization

The pith

Optimal quantizer and controller designs for networked control systems emerge from solving a stochastic control problem regularized by mutual information to limit privacy leakage.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows how to jointly design a quantizer and a controller for a dynamical system sending quantized measurements to a remote controller, while limiting what an adversary can learn about private inputs from the quantized signals and control actions. The approach casts the problem as a stochastic control task where mutual information between private inputs and the outputs serves as a penalty term. A sympathetic reader would care because many real systems, such as building controls or sensor networks, must balance performance against the risk of information leakage through communication channels. The authors derive coupled Bellman equations via dynamic programming, prove that the optimal controller is deterministic, and show the quantizer adjusts beliefs in a closed loop. They then provide a practical parameterization and gradient-based optimization method validated on a building control example.

Core claim

The optimal privacy-aware quantizer and controller are obtained by solving a stochastic control problem with mutual information regularization, where the mutual information measures the privacy leakage through the quantizer and controller. The authors first derive the coupled Bellman equations for the optimal quantizer and controller using the dynamic programming decomposition method. They then analyze the structural properties of the solution, showing that the optimal controller is deterministic, while the optimal quantizer regulates the adversary's belief in a closed-loop manner to enhance privacy. To enable numerical optimization, the quantizer and controller are jointly parameterized and then updated via policy gradient methods, and a binary classification approach is used to approximate privacy leakage.

What carries the argument

A stochastic control problem regularized by the mutual information between the private input process and the pair of quantization outputs plus control outputs, solved via coupled Bellman equations from dynamic programming decomposition.
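As a hedged sketch of what such a regularized objective can look like, the display below uses assumed notation that is not taken from the paper: $x_t$ for the state, $u_t$ for the control, $s_t$ for the quantization index, $w_t$ for the private input, $c$ for a stage cost, and $\lambda \ge 0$ for the regularization weight. A superscript $T$ denotes a length-$T$ trajectory.

```latex
\min_{\text{quantizer},\ \text{controller}} \;
\mathbb{E}\!\left[ \sum_{t=1}^{T} c(x_t, u_t) \right]
\;+\; \lambda \, I\!\left(w^{T};\, s^{T}, u^{T}\right)
```

In this form, $\lambda = 0$ leaves a pure performance objective, and larger $\lambda$ penalizes leakage more heavily.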

If this is right

The optimal controller takes a deterministic form that minimizes a combined cost including control performance and privacy penalty.
The quantizer operates by regulating the adversary's posterior belief about the private input in a closed-loop fashion.
Joint parameterization of quantizer and controller allows policy gradient methods to find numerical solutions.
A binary classification approximation can estimate the mutual information privacy leakage term.
The designs improve privacy in applications such as building control systems without excessive performance loss.
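The belief-regulation idea can be illustrated with a static toy example. The sketch below is not the paper's model: it assumes a binary private input, collapses the system dynamics into a single likelihood table `P(symbol | private input)`, and the function and variable names are hypothetical. It shows how the adversary's posterior moves less when the quantizer output depends only weakly on the private input.

```python
import numpy as np

def belief_update(prior, likelihood, symbol):
    """Bayes update of the adversary's belief over private-input values.

    prior:      shape (K,), P(private = k)
    likelihood: shape (K, M), P(symbol = m | private = k), assumed known
    symbol:     observed quantization index in {0, ..., M-1}
    """
    unnormalized = prior * likelihood[:, symbol]
    return unnormalized / unnormalized.sum()

prior = np.array([0.5, 0.5])

# Toy quantizer whose output depends strongly on the private input.
informative = np.array([[0.9, 0.1],
                        [0.1, 0.9]])
# Toy quantizer whose output depends only weakly on the private input.
obfuscating = np.array([[0.55, 0.45],
                        [0.45, 0.55]])

belief_informative = belief_update(prior, informative, symbol=0)
belief_obfuscating = belief_update(prior, obfuscating, symbol=0)

print("posterior after informative quantizer:", belief_informative)
print("posterior after obfuscating quantizer:", belief_obfuscating)
```

In a closed-loop design the likelihood table would itself depend on the current belief and state, which this static sketch does not capture.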

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar co-design methods could extend to systems where the adversary has partial or noisy observations of the control outputs.
If the mutual information measure underestimates leakage to sophisticated adversaries, the resulting designs may still provide partial protection but require additional safeguards.
The closed-loop belief regulation property of the quantizer suggests potential for adaptive privacy mechanisms in time-varying environments.

Load-bearing premise

Mutual information between the private inputs and the observed quantization and control outputs fully captures the privacy leakage that a realistic adversary could achieve, and the designer knows the exact system dynamics, noise distributions, and adversary observation model.

What would settle it

A simulation or experiment in which an adversary using inference methods beyond those captured by mutual information extracts substantially more information about the private inputs than the regularized design predicts, while the system performance remains as expected.

Figures

Figures reproduced from arXiv: 2604.08860 by Chuanghong Weng, Ehsan Nekouei.

**Figure 2.** Trajectories of the deviation of CO2 concentration and measurements (a), the occupancy trajectory (b), and the misdetection instances of the occupancy estimator (c).

**Figure 3.** Structures of networked control systems: (a) privacy-aware design in Theorem 1; (b) privacy-aware design in …

**Figure 4.** The average stage cost under the optimal privacy-aware quantizer-controller co-design.

**Figure 5.** … increases, leading to poorer regulation performance. At the same time, …

**Figure 6.** Trajectories of quantization indices.

**Figure 7.** Misdetections of the occupancy estimator.

Original abstract

This paper investigates the optimal privacy-aware networked control problem, in which the dynamical system affected by a private input process sends its measurement to a remote controller after stochastic quantization. An adversary seeks to infer private system inputs from quantization results and control outputs. The optimal privacy-aware quantizer and controller are obtained by solving a stochastic control problem with mutual information regularization, where the mutual information measures the privacy leakage through the quantizer and controller. We first derive the coupled Bellman equations for the optimal quantizer and controller using the dynamic programming decomposition method. We then analyze the structural properties of the solution, showing that the optimal controller is deterministic, while the optimal quantizer regulates the adversary's belief in a closed-loop manner to enhance privacy. To enable numerical optimization, the quantizer and controller are jointly parameterized and then updated via policy gradient methods, and a binary classification approach is used to approximate privacy leakage. Finally, we validate the effectiveness of the proposed approach through numerical experiments on a building control system.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a DP formulation for MI-regularized quantizer-controller co-design with some structural results, but the binary-classifier approximation for privacy leakage lacks validation and weakens the numerical claims.

The main point is a joint optimization of stochastic quantizer and controller to balance control performance against privacy leakage measured by mutual information between private inputs and the observed outputs. They decompose the problem with dynamic programming to get coupled Bellman equations, then prove that the optimal controller is deterministic and the quantizer works in closed loop to shape the adversary's belief state. That structural analysis is the cleanest part of the work and is worth noting. The numerical side parameterizes both policies and uses policy gradients with a binary classifier to estimate the MI term on a building control example. The formulation itself is new enough in its specific combination of elements and does not collapse to prior results. The soft spot is the MI approximation. Mutual information is a global functional, so the classifier surrogate needs to track it reliably for the gradients to reach the DP solution, yet the paper supplies no error bounds, bias checks, or head-to-head tests against consistent estimators. That leaves the reported policies as solutions to a distorted objective rather than the stated one. The usual assumption that the designer knows the full system and adversary models is also there and limits how far the results travel. This is aimed at researchers working on secure networked control for IoT or infrastructure. It has enough formal grounding and a concrete application to deserve referee time, though the approximation step will need tightening.

Referee Report

2 major / 2 minor

Summary. This paper addresses the co-design of quantizers and controllers in networked control systems, seeking optimal performance while preserving privacy against an adversary that infers private inputs from quantized measurements and control signals. The problem is formulated as a stochastic control task with mutual information regularization. The authors derive coupled Bellman equations via dynamic programming and prove that the optimal controller is deterministic and that the quantizer regulates the adversary's belief in closed loop. The numerical solution parameterizes both policies and applies policy gradients, with a binary classifier estimating the privacy leakage term, and is demonstrated on a building control system.

Significance. Should the derivations of the Bellman equations and structural properties hold, and the numerical method accurately optimize the intended objective, this work offers a significant contribution to privacy-aware control design by providing both theoretical structure and a practical optimization framework for networked systems where privacy leakage is a concern.

major comments (2)

[Numerical optimization section] Section on numerical optimization (policy gradient with binary classification): The binary classification approach used to approximate the mutual information privacy leakage term lacks error bounds, bias analysis, or comparison to consistent estimators such as MINE. Because mutual information is a global non-additive functional of the joint trajectory distribution, this surrogate must faithfully track the true I(private inputs; quant+control outputs) for the gradient updates to converge to the solution of the coupled Bellman equations; without such validation the reported policies may optimize a distorted objective rather than the intended MI-regularized cost.
[Dynamic programming decomposition section] Dynamic programming decomposition section: The derivation of the coupled Bellman equations for the joint quantizer-controller optimization is asserted but the manuscript provides insufficient step-by-step details on the decomposition and the handling of the mutual information term within the value functions, making independent verification of the claimed structural properties (deterministic controller, closed-loop belief regulation) difficult.
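To make the first major comment concrete, here is a minimal sketch of a classification-based mutual information estimate on a case where the true value is known. It is not the paper's estimator: it swaps in a tabular (count-based) classifier on a toy binary channel, and all names are hypothetical. The classifier separates samples of the joint distribution from samples with the second variable shuffled; with balanced classes its logit approximates the log density ratio, and averaging that logit over joint samples estimates the mutual information.

```python
import numpy as np

def classifier_mi_estimate(s, y, num_s, num_y, rng):
    """Estimate I(S;Y) in nats via the density-ratio (classification) trick."""
    # Class 1: joint samples. Class 0: y shuffled, approximating p(s)p(y).
    y_shuffled = rng.permutation(y)
    joint_counts = np.ones((num_s, num_y))    # +1 smoothing avoids log(0)
    product_counts = np.ones((num_s, num_y))
    np.add.at(joint_counts, (s, y), 1)
    np.add.at(product_counts, (s, y_shuffled), 1)
    # Tabular classifier: logit of P(class 1 | s, y) with balanced classes.
    logit = np.log(joint_counts) - np.log(product_counts)
    # Average the logit over samples drawn from the joint distribution.
    return float(np.mean(logit[s, y]))

rng = np.random.default_rng(0)
n, flip = 50_000, 0.1

# Toy channel: uniform private bit s, output y equals s flipped w.p. `flip`.
s = rng.integers(0, 2, size=n)
y = np.where(rng.random(n) < flip, 1 - s, s)

estimate = classifier_mi_estimate(s, y, 2, 2, rng)
# Closed form for this channel: ln 2 minus the binary entropy of `flip` (nats).
true_mi = np.log(2) + flip * np.log(flip) + (1 - flip) * np.log(1 - flip)

print(f"classifier estimate: {estimate:.4f} nats, closed form: {true_mi:.4f} nats")
```

A check of this kind, run on a setting with a known answer, is the sort of bias validation the referee asks for. A neural classifier on continuous trajectories would not come with the same guarantee.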

minor comments (2)

[Abstract] Abstract: The description of the binary classification approach to approximate privacy leakage should briefly specify the classifier architecture, training procedure, and how it is embedded in the policy gradient estimator for clarity.
[Numerical experiments section] Numerical experiments section: The building control system example would benefit from explicit reporting of system matrices, noise statistics, adversary observation model, and the value of the mutual information regularization weight to support reproducibility of the numerical results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We address each of the major comments below and outline the revisions we plan to make to strengthen the paper.

Referee: [Numerical optimization section] Section on numerical optimization (policy gradient with binary classification): The binary classification approach used to approximate the mutual information privacy leakage term lacks error bounds, bias analysis, or comparison to consistent estimators such as MINE. Because mutual information is a global non-additive functional of the joint trajectory distribution, this surrogate must faithfully track the true I(private inputs; quant+control outputs) for the gradient updates to converge to the solution of the coupled Bellman equations; without such validation the reported policies may optimize a distorted objective rather than the intended MI-regularized cost.

Authors: We appreciate the referee's concern regarding the approximation of the mutual information term. The binary classification method is used to provide a differentiable estimate of the privacy leakage for policy gradient optimization. While the manuscript does not provide theoretical error bounds or a comparison to MINE, the approach is motivated by its computational efficiency and has been validated through the numerical experiments showing effective privacy-performance trade-offs. To address this, we will revise the numerical optimization section to include an analysis of the estimator's bias and variance, along with a comparison to MINE on the building control example. This will demonstrate that the surrogate sufficiently tracks the true mutual information for the purposes of optimization.

revision: yes

Referee: [Dynamic programming decomposition section] Dynamic programming decomposition section: The derivation of the coupled Bellman equations for the joint quantizer-controller optimization is asserted but the manuscript provides insufficient step-by-step details on the decomposition and the handling of the mutual information term within the value functions, making independent verification of the claimed structural properties (deterministic controller, closed-loop belief regulation) difficult.

Authors: We agree that more detailed exposition of the dynamic programming decomposition would aid in verifying the results. The derivation involves applying the principle of optimality to the joint problem, separating the quantizer and controller decisions, and incorporating the mutual information as an additive term in the cost that depends on the joint distribution. The structural properties are proven by showing that the optimal controller policy does not depend on randomization and that the quantizer can be chosen to minimize the information leakage given the closed-loop dynamics. In the revised manuscript, we will provide a more comprehensive step-by-step derivation of the coupled Bellman equations, explicitly detailing how the mutual information term is handled within the value functions, and include additional explanations for the structural properties.

revision: yes
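For orientation, the generic chain rule of mutual information already gives an additive, per-stage form for a trajectory-level leakage term. The identity below is standard information theory written in assumed notation ($w^T$ for the private inputs, $s_t$ for the quantization index, $u_t$ for the control). It is not the paper's own expansion, which may exploit the system structure further.

```latex
I\!\left(w^{T};\, s^{T}, u^{T}\right)
= \sum_{t=1}^{T} I\!\left(w^{T};\, s_t, u_t \,\middle|\, s^{t-1}, u^{t-1}\right)
```

Each summand depends on the distribution of past outputs, which is one way to see why a belief over the private input would enter the value functions.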

Circularity Check

0 steps flagged

No circularity; derivation applies standard DP to regularized objective with explicit numerical approximation

The paper states it derives coupled Bellman equations for the MI-regularized stochastic control problem via dynamic programming decomposition, then analyzes structural properties (optimal controller deterministic, quantizer regulates belief in closed loop). Numerical solution is described separately as joint parameterization updated by policy gradient with binary classification to approximate the privacy leakage term. No quoted step reduces the claimed Bellman equations or optimality result to a fitted parameter or self-citation by construction; the approximation is presented transparently as an enabling technique rather than an exact derivation. The approach remains self-contained against the model assumptions and is checked via numerical experiments on a building control system, with no load-bearing self-referential reductions.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on standard stochastic control assumptions plus the modeling choice that mutual information quantifies privacy leakage; the regularization weight balancing performance and privacy is a free parameter chosen or tuned for the application.

free parameters (1)

mutual information regularization weight
The scalar weight multiplying the mutual information term in the objective is a tunable hyperparameter that trades off control cost against privacy leakage and must be selected for each instance.
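The role of this weight can be sketched on a toy problem. The example below is an assumption-laden illustration, not the paper's experiment: a uniform private bit is reported through a quantizer that flips it with probability `p`, the performance cost is taken to be the flip probability itself, and the leakage is the mutual information of the resulting binary symmetric channel, 1 - H_b(p) bits. A grid search then shows how the chosen flip probability moves as the weight grows.

```python
import numpy as np

def binary_entropy_bits(p):
    """Binary entropy H_b(p) in bits, clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def best_flip_probability(weight, grid):
    """Grid-search the flip probability minimizing cost + weight * leakage."""
    performance_cost = grid                          # toy proxy: P(reported bit is wrong)
    leakage_bits = 1.0 - binary_entropy_bits(grid)   # I(S;Y) for a uniform bit
    objective = performance_cost + weight * leakage_bits
    return float(grid[np.argmin(objective)])

grid = np.linspace(0.0, 0.5, 1001)
weights = [0.0, 0.5, 1.0, 2.0, 5.0]
best = [best_flip_probability(w, grid) for w in weights]

for w, p in zip(weights, best):
    print(f"weight={w:.1f}  flip probability={p:.3f}")
```

With zero weight the search keeps the quantizer exact, and the selected flip probability grows toward 0.5 as the weight increases, which is the trade-off the weight is meant to tune.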

axioms (1)

domain assumption: Mutual information between private inputs and (quantized measurements, control outputs) is an appropriate and complete measure of privacy leakage to the adversary.
Invoked when the problem is cast as a regularized stochastic control task; this is a modeling choice rather than a derived result.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

[1] A. Ebadat, G. Bottegal, D. Varagnolo, B. Wahlberg, H. Hjalmarsson, and K. H. Johansson, “Blind identification strategies for room occupancy estimation,” in 2015 European control conference (ECC), pp. 1315–1320, IEEE, 2015.
[2] L. Rueda, K. Agbossou, A. Cardenas, N. Henao, and S. Kelouwani, “A comprehensive review of approaches to building occupancy detection,” Building and Environment, vol. 180, p. 106966, 2020.
[3] C. Dwork, “Differential privacy: A survey of results,” in International conference on theory and applications of models of computation, pp. 1–19, Springer, 2008.
[4] M. Bloch, O. Günlü, A. Yener, F. Oggier, H. V. Poor, L. Sankar, and R. F. Schaefer, “An overview of information-theoretic security and privacy: Metrics, limits and applications,” IEEE Journal on Selected Areas in Information Theory, vol. 2, no. 1, pp. 5–22, 2021.
[5] J. Le Ny and G. J. Pappas, “Differentially private filtering,” IEEE Transactions on Automatic Control, vol. 59, no. 2, pp. 341–354, 2013.
[6] Z. Huang, Y. Wang, S. Mitra, and G. E. Dullerud, “On the cost of differential privacy in distributed control systems,” in Proceedings of the 3rd international conference on High confidence networked systems, pp. 105–114, 2014.
[7] Y. Wang, Z. Huang, S. Mitra, and G. E. Dullerud, “Differential privacy in linear distributed control systems: Entropy minimizing mechanisms and performance tradeoffs,” IEEE Transactions on Control of Network Systems, vol. 4, no. 1, pp. 118–130, 2017.
[8] X. Chen, L. Huang, L. He, S. Dey, and L. Shi, “A differentially private method for distributed optimization in directed networks via state decomposition,” IEEE Transactions on Control of Network Systems, vol. 10, no. 4, pp. 2165–2177, 2023.
[9] E. Nekouei, T. Tanaka, M. Skoglund, and K. H. Johansson, “Information-theoretic approaches to privacy in estimation and control,” Annual Reviews in Control, vol. 47, pp. 412–422, 2019.
[10] T. Tanaka, M. Skoglund, H. Sandberg, and K. H. Johansson, “Directed information and privacy loss in cloud-based control,” in 2017 American control conference (ACC), pp. 1666–1672, IEEE, 2017.
[11] S. Li, A. Khisti, and A. Mahajan, “Information-theoretic privacy for smart metering systems with a rechargeable battery,” IEEE Transactions on Information Theory, vol. 64, no. 5, pp. 3679–3695, 2018.
[12] T. L. Molloy and G. N. Nair, “Smoother entropy for active state trajectory estimation and obfuscation in pomdps,” IEEE Transactions on Automatic Control, vol. 68, no. 6, pp. 3557–3572, 2023.
[13] M. Araya, O. Buffet, V. Thomas, and F. Charpillet, “A pomdp extension with belief-dependent rewards,” Advances in neural information processing systems, vol. 23, pp. 64–72, 2010.
[14] W. Zhang, B. Jiang, M. Li, and X. Lin, “Privacy-preserving aggregate mobility data release: An information-theoretic deep reinforcement learning approach,” IEEE Transactions on Information Forensics and Security, vol. 17, pp. 849–864, 2022.
[15] G. Smith, “On the foundations of quantitative information flow,” in International Conference on Foundations of Software Science and Computational Structures, pp. 288–302, Springer, 2009.
[16] I. Issa, A. B. Wagner, and S. Kamath, “An operational approach to information leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2019.
[17] S. Saeidian, G. Cervia, T. J. Oechtering, and M. Skoglund, “Pointwise maximal leakage,” IEEE Transactions on Information Theory, 2023.
[18] F. P. Calmon, M. Varia, M. Médard, M. M. Christiansen, K. R. Duffy, and S. Tessaro, “Bounds on inference,” in 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 567–574, IEEE, 2013.
[19] T. M. Cover, Elements of information theory. John Wiley & Sons, 1999.
[20] A. Nayyar, A. Mahajan, and D. Teneketzis, “Decentralized stochastic control with partial history sharing: A common information approach,” IEEE Transactions on Automatic Control, vol. 58, no. 7, pp. 1644–1658, 2013.
[21] E. Hubbard, L. Cregg, and S. Yüksel, “Reinforcement learning for jointly optimal coding and control policies for a controlled markovian system over a communication channel,” IEEE Transactions on Automatic Control, pp. 1–16, 2026.
[22] V. Krishnamurthy, Partially observed Markov decision processes. Cambridge university press, 2016.
[23] M. Thomas and A. T. Joy, Elements of information theory. Wiley-Interscience, 2006.
[24] M. Otterlo and M. Wiering, “Reinforcement learning and markov decision processes,” Reinforcement Learning: State of the Art, pp. 3–42, 01 2012.
[25] S. Molavipour, G. Bassi, and M. Skoglund, “Neural estimators for conditional mutual information using nearest neighbors sampling,” IEEE transactions on signal processing, vol. 69, pp. 766–780, 2021.
[26] J. T. Rolfe, “Discrete variational autoencoders,” in International Conference on Learning Representations, 2017.

Appendix A: Proof of Lemma 1

To prove Lemma 1, we first show that the mutual information can be expanded into an additive form.

Lemma 4. The mutual information term in (7) can be expanded as

$$I(S^T, U^T; Y^T) = \sum_{t=1}^{T} I(S_t; Y^{t-1} \mid S^{t-1}, U^{t-1}) = \sum_{t=1}^{T} \mathbb{E}\, c_i \ldots$$
