
arxiv: 2606.17743 · v2 · pith:NHSBJMXRnew · submitted 2026-06-16 · 💻 cs.IT · cs.SY · eess.SY · math.IT · math.OC

Information-Theoretic Meta Dynamic Programming for Signalling and Control of POMDPs

classification 💻 cs.IT · cs.SY · eess.SY · math.IT · math.OC
keywords control · dynamic · information · information-theoretic · pomdps · programming · signalling · distribution

In this paper, we study the information-theoretic characterization of simultaneous signalling and control over channels modeled by partially observable Markov decision processes (POMDPs). The problem is formulated as an optimization over randomized control strategies that maximize the directed information from actions to observations, subject to an average-cost constraint. We derive a novel dynamic programming framework in which the state is defined on the space of conditional probability distributions, leading to a high-level ``meta'' dynamic program. Specifically, we show that two coupled information states, namely, the posterior distribution of the system state and a distribution over such posteriors, satisfy Markov recursions and provide sufficient statistics for optimal control. This structure enables the decomposition of optimal strategies into separated randomized policies that depend only on these information states. Our results establish necessary and sufficient conditions for optimality and unify classical stochastic control and information-theoretic formulations. In particular, we show that in the absence of signalling, the proposed framework reduces to the standard dynamic programming equations for POMDPs. The developed approach provides a principled foundation for analyzing and designing control systems with intrinsic information constraints.
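One of the two information states named in the abstract is the posterior distribution of the system state. As a point of reference only, the sketch below shows the standard Bayes-filter recursion for that posterior in a finite POMDP, which is the object the classical dynamic programming equations are built on. It is not the paper's meta dynamic program: the function name `belief_update`, the array layout of `T` and `O`, and the two-state model are all hypothetical choices made for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """One step of the standard POMDP belief (posterior) recursion.

    Assumed layout (illustrative, not taken from the paper):
      belief[s]        = P(state = s | history so far)
      T[a][s][s2]      = P(next state = s2 | state = s, action = a)
      O[a][s2][z]      = P(observation = z | next state = s2, action = a)
    """
    n = len(belief)
    # Prediction step: push the current posterior through the transition kernel.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [predicted[s2] * O[action][s2][observation] for s2 in range(n)]
    evidence = sum(unnormalized)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    # Normalize so the result is again a probability distribution.
    return [u / evidence for u in unnormalized]


# Hypothetical model with two states, two actions, and two observations.
T = [
    [[0.9, 0.1], [0.2, 0.8]],  # transitions under action 0
    [[0.5, 0.5], [0.5, 0.5]],  # transitions under action 1
]
O = [
    [[0.8, 0.2], [0.3, 0.7]],  # observation likelihoods under action 0
    [[0.5, 0.5], [0.5, 0.5]],  # observation likelihoods under action 1
]
b0 = [0.5, 0.5]
b1 = belief_update(b0, action=0, observation=1, T=T, O=O)
```

The new posterior depends on the past only through the previous posterior, the action, and the observation, which is the Markov property that lets the posterior serve as a state for dynamic programming. The second information state in the abstract, a distribution over such posteriors, is not modeled here.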
