Perfect Privacy and Strong Stationary Times for Markovian Sources
Pith reviewed 2026-05-16 11:47 UTC · model grok-4.3
The pith
Erasing data up to a strong stationary time achieves perfect privacy for Markov chain initial states while redacting only a constant average number of points independent of sequence length.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For data generated by a finite time-homogeneous Markov chain, erasing observations up to a strong stationary time preserves perfect privacy of the initial state under suitable conditions on the chain. An optimal sequential redaction mechanism admits an equivalent window interpretation, and both schemes achieve the optimal expected Hamming distortion by redacting only a constant average number of data points independent of the data length N.
What carries the argument
Strong stationary time: the first time at which the Markov chain reaches its stationary distribution exactly, used as the stopping point for a window-based redaction prefix that severs dependence on the initial state.
If this is right
- Perfect privacy is achievable with utility loss that does not grow with sequence length N.
- Sequential redaction policies can be realized equivalently as fixed-window erasures tied to stationary times.
- For long Markov sequences, nearly the entire data set can be released while satisfying the privacy constraint.
- The average number of redactions remains bounded by a constant determined only by the chain, not by N.
Where Pith is reading between the lines
- The approach may extend to streaming or infinite-length Markov processes where only a finite average prefix needs redaction.
- Explicit computation of the average redaction count could follow from known bounds on mixing or hitting times of the chain.
- Similar window-based redaction ideas might apply to other chain-dependent privacy problems such as protecting a hidden state at a fixed time.
- The equivalence between sequential and window mechanisms suggests that online privacy enforcement can be made offline-equivalent for this setting.
Load-bearing premise
Erasing data up to a strong stationary time is sufficient to preserve privacy of the initial state for the finite time-homogeneous Markov chain under consideration.
What would settle it
A concrete counterexample consisting of a small finite Markov chain in which an observer can still infer the initial state with positive probability from the data released after redaction up to the strong stationary time would disprove the privacy claim.
read the original abstract
We consider the problem of sharing correlated data under a perfect information-theoretic privacy constraint. We focus on redaction (erasure) mechanisms, in which data are either withheld or released unchanged, and measure utility by the average cardinality of the released set, equivalently, the expected Hamming distortion. Assuming the data are generated by a finite time-homogeneous Markov chain, we study the protection of the initial state while maximizing the amount of shared data. We establish a connection between perfect privacy and window-based redaction schemes, showing that erasing data up to a strong stationary time preserves privacy under suitable conditions. We further study an optimal sequential redaction mechanism and prove that it admits an equivalent window interpretation. Interestingly, we show that both mechanisms achieve the optimal distortion while redacting only a constant average number of data points, independent of the data length~$N$.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies redaction mechanisms for releasing data generated by a finite time-homogeneous Markov chain while protecting the initial state under perfect information-theoretic privacy. It connects such privacy to strong stationary times, showing that erasure up to a strong stationary time preserves privacy under suitable conditions, establishes equivalence between an optimal sequential redaction mechanism and a window-based scheme, and claims both achieve optimal expected Hamming distortion by redacting only a constant average number of symbols independent of sequence length N.
Significance. If the central claims on privacy preservation and constant-redaction optimality hold, the results would establish a precise link between strong stationary times and perfect privacy for Markov sources, showing that utility loss need not grow with N. This would be of interest in information-theoretic privacy and data release for dependent sources.
major comments (2)
- [§3] §3 (privacy via strong stationary times): the argument that the released suffix after τ is independent of X_0 does not automatically extend to the marginal distribution of the observed redaction length, since the law of τ depends on the initial distribution μ; the skeptic concern that P(observed length) can leak information about X_0 therefore requires an explicit counter-argument or additional conditioning in the mechanism definition.
- [§5] §5 (optimality and equivalence): the proof that both mechanisms achieve the same optimal distortion with E[redactions] constant in N must specify the precise mixing or aperiodicity conditions on the chain under which the expectation remains bounded independently of N and of the starting distribution; without this, the independence-of-N claim is not fully load-bearing.
minor comments (2)
- Notation for the strong stationary time τ and the released suffix should be introduced with an explicit equation showing how the observed length is part of the released object.
- [Abstract] The abstract states 'under suitable conditions' for privacy preservation; these conditions should be stated explicitly already in the introduction rather than deferred to the technical sections.
Simulated Author's Rebuttal
We are grateful to the referee for the thorough review and constructive comments on our manuscript. We address each major comment below and will incorporate revisions to clarify the privacy argument and specify the required conditions on the Markov chain.
read point-by-point responses
-
Referee: [§3] §3 (privacy via strong stationary times): the argument that the released suffix after τ is independent of X_0 does not automatically extend to the marginal distribution of the observed redaction length, since the law of τ depends on the initial distribution μ; the skeptic concern that P(observed length) can leak information about X_0 therefore requires an explicit counter-argument or additional conditioning in the mechanism definition.
Authors: We thank the referee for identifying this subtlety in the privacy argument. While the manuscript establishes that the suffix following a strong stationary time τ is independent of the initial state X_0 under the Markov property, we acknowledge that the dependence of the distribution of τ on the initial distribution μ requires careful handling to ensure the entire observed redacted sequence (including the length) is independent of X_0. We will revise §3 to provide an explicit proof of this independence by incorporating an additional randomization in the stopping rule that renders the distribution of the observed length independent of X_0, while maintaining the strong stationarity of the released suffix. This revision will include the necessary conditioning and a detailed verification that the mutual information I(X_0; observed data) = 0. revision: yes
-
Referee: [§5] §5 (optimality and equivalence): the proof that both mechanisms achieve the same optimal distortion with E[redactions] constant in N must specify the precise mixing or aperiodicity conditions on the chain under which the expectation remains bounded independently of N and of the starting distribution; without this, the independence-of-N claim is not fully load-bearing.
Authors: We appreciate the referee's request for precision on the conditions. The manuscript assumes a finite time-homogeneous Markov chain, but to make the constant-in-N claim rigorous, we will revise §5 to explicitly state that the chain is irreducible and aperiodic. Under these standard conditions, there exists a strong stationary time τ with finite expectation E[τ] < ∞ that is independent of the sequence length N and bounded uniformly over starting distributions (due to finiteness of the state space). Consequently, the expected number of redactions remains bounded by E[τ] independently of N. We will add this clarification and a brief reference to the relevant mixing time bounds to support the optimality result. revision: yes
Circularity Check
No circularity; derivation relies on standard Markov properties
full rationale
The paper connects perfect privacy to redaction up to a strong stationary time using the definition that the chain reaches stationarity at τ, making the suffix independent of the initial state conditional on τ. The optimal distortion claim and constant average redactions follow from bounding E[τ] for finite-state time-homogeneous Markov chains, which is independent of horizon N by standard renewal theory. No equation reduces a claimed prediction to a fitted parameter by construction, no load-bearing self-citation is invoked for a uniqueness theorem, and no ansatz is smuggled via prior work. The derivation is self-contained against external Markov chain benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Data are generated by a finite time-homogeneous Markov chain
Reference graph
Works this paper leans on
-
[1]
D. A. Levin and Y . Peres,Markov chains and mixing times. American Mathematical Soc., 2017, vol. 107
work page 2017
-
[2]
Shuffling cards and stopping times,
D. Aldous and P. Diaconis, “Shuffling cards and stopping times,”The American Mathematical Monthly, vol. 93, no. 5, pp. 333–348, 1986
work page 1986
-
[3]
Pufferfish privacy mechanisms for correlated data,
S. Song, Y . Wang, and K. Chaudhuri, “Pufferfish privacy mechanisms for correlated data,” inProceedings of the 2017 ACM International Conference on Management of Data, 2017, pp. 1291–1306
work page 2017
-
[4]
No free lunch in data privacy,
D. Kifer and A. Machanavajjhala, “No free lunch in data privacy,” in Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, 2011, pp. 193–204
work page 2011
-
[5]
Quantifying differential privacy under temporal correlations,
Y . Cao, M. Yoshikawa, Y . Xiao, and L. Xiong, “Quantifying differential privacy under temporal correlations,” in2017 IEEE 33rd International Conference on Data Engineering (ICDE). IEEE, 2017, pp. 821–832
work page 2017
-
[6]
Blowfish privacy: Tuning privacy-utility trade-offs using policies,
X. He, A. Machanavajjhala, and B. Ding, “Blowfish privacy: Tuning privacy-utility trade-offs using policies,” inProceedings of the 2014 ACM SIGMOD international conference on Management of data, 2014, pp. 1447–1458
work page 2014
-
[7]
Correlated differential privacy: Hiding information in non-iid data set,
T. Zhu, P. Xiong, G. Li, and W. Zhou, “Correlated differential privacy: Hiding information in non-iid data set,”IEEE Transactions on Informa- tion Forensics and Security, vol. 10, no. 2, pp. 229–242, 2014
work page 2014
-
[8]
Correlated network data publication via differential privacy,
R. Chen, B. C. Fung, P. S. Yu, and B. C. Desai, “Correlated network data publication via differential privacy,”The VLDB Journal, vol. 23, pp. 653–676, 2014
work page 2014
-
[9]
Bayesian differential privacy on correlated data,
B. Yang, I. Sato, and H. Nakagawa, “Bayesian differential privacy on correlated data,” inProceedings of the 2015 ACM SIGMOD interna- tional conference on Management of Data, 2015, pp. 747–762
work page 2015
-
[10]
Dependence makes you vulnber- able: Differential privacy under dependent tuples
C. Liu, S. Chakraborty, and P. Mittal, “Dependence makes you vulnber- able: Differential privacy under dependent tuples.” inNDSS, vol. 16, 2016, pp. 21–24
work page 2016
-
[11]
Protecting location privacy with personalized k- anonymity: Architecture and algorithms,
B. Gedik and L. Liu, “Protecting location privacy with personalized k- anonymity: Architecture and algorithms,”IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp. 1–18, 2007
work page 2007
-
[12]
Protecting locations with differential privacy under temporal correlations,
Y . Xiao and L. Xiong, “Protecting locations with differential privacy under temporal correlations,” inProceedings of the 22nd ACM SIGSAC conference on computer and communications security, 2015, pp. 1298– 1309
work page 2015
-
[13]
Between close enough to reveal and far enough to protect: a new privacy region for correlated data,
L. Maßny, R. Bitar, F. Ye, and S. E. Rouayheb, “Between close enough to reveal and far enough to protect: a new privacy region for correlated data,”arXiv preprint arXiv:2501.14313, 2025
-
[14]
From the information bottleneck to the privacy funnel,
A. Makhdoumi, S. Salamatian, N. Fawaz, and M. Médard, “From the information bottleneck to the privacy funnel,” in2014 IEEE Information Theory Workshop (ITW 2014). IEEE, 2014, pp. 501–505
work page 2014
-
[15]
A compressive privacy approach to generalized informa- tion bottleneck and privacy funnel problems,
S.-Y . Kung, “A compressive privacy approach to generalized informa- tion bottleneck and privacy funnel problems,”Journal of the Franklin Institute, vol. 355, no. 4, pp. 1846–1872, 2018
work page 2018
-
[16]
The information bottleneck method
N. Tishby, F. C. Pereira, and W. Bialek, “The information bottleneck method,”arXiv preprint physics/0004057, 2000
work page internal anchor Pith review Pith/arXiv arXiv 2000
-
[17]
Fundamental limits of perfect privacy,
F. P. Calmon, A. Makhdoumi, and M. Médard, “Fundamental limits of perfect privacy,” in2015 IEEE International Symposium on Information Theory (ISIT). IEEE, 2015, pp. 1796–1800
work page 2015
-
[18]
New privacy mech- anism design with direct access to the private data,
A. Zamani, T. J. Oechtering, and M. Skoglund, “New privacy mech- anism design with direct access to the private data,”arXiv preprint arXiv:2309.09033, 2023
-
[19]
Statistical privacy mechanism design using separation technique,
——, “Statistical privacy mechanism design using separation technique,” in2024 32nd European Signal Processing Conference (EUSIPCO). IEEE, 2024, pp. 735–739
work page 2024
-
[20]
Secrecy by design with appli- cations to privacy and compression,
Y . Y . Shkel, R. S. Blum, and H. V . Poor, “Secrecy by design with appli- cations to privacy and compression,”IEEE Transactions on Information Theory, vol. 67, no. 2, pp. 824–843, 2020
work page 2020
-
[21]
Information-theoretic privacy-preserving schemes based on perfect privacy,
B. Rassouli and D. Gündüz, “Information-theoretic privacy-preserving schemes based on perfect privacy,”arXiv preprint arXiv:2301.11754, 2023
-
[22]
Mechanisms for hiding sensitive genotypes with information-theoretic privacy,
F. Ye, H. Cho, and S. El Rouayheb, “Mechanisms for hiding sensitive genotypes with information-theoretic privacy,”IEEE transactions on information theory, vol. 68, no. 6, pp. 4090–4105, 2022
work page 2022
-
[23]
On-off privacy with correlated requests,
C. Naim, F. Ye, and S. El Rouayheb, “On-off privacy with correlated requests,” in2019 IEEE International Symposium on Information Theory (ISIT). IEEE, 2019, pp. 817–821
work page 2019
-
[24]
Preserving on-off privacy for past and future requests,
F. Ye, C. Naim, and S. El Rouayheb, “Preserving on-off privacy for past and future requests,” in2019 IEEE Information Theory Workshop (ITW). IEEE, 2019, pp. 1–5
work page 2019
-
[25]
On-off privacy against correlation over time,
——, “On-off privacy against correlation over time,”IEEE Transactions on Information Forensics and Security, vol. 16, pp. 2104–2117, 2021
work page 2021
-
[26]
On-off privacy in the presence of correlation,
——, “On-off privacy in the presence of correlation,”IEEE Transactions on Information Theory, vol. 67, no. 11, pp. 7438–7457, 2021
work page 2021
-
[27]
Strong uniform times and finite random walks,
D. Aldous and P. Diaconis, “Strong uniform times and finite random walks,”Advances in applied mathematics, vol. 8, no. 1, pp. 69–97, 1987
work page 1987
-
[28]
Strong stationary times via a new form of duality,
P. Diaconis and J. A. Fill, “Strong stationary times via a new form of duality,”The Annals of Probability, pp. 1483–1522, 1990
work page 1990
-
[29]
Reversible markov chains and random walks on graphs,
D. Aldous and J. A. Fill, “Reversible markov chains and random walks on graphs,” 2002, unfinished monograph, recompiled 2014, available at http://www.stat.berkeley.edu/$\sim$aldous/RWG/book.html
work page 2002
-
[30]
Efficient stopping rules for markov chains,
L. Lovász and P. Winkler, “Efficient stopping rules for markov chains,” inProceedings of the twenty-seventh annual ACM symposium on Theory of computing, 1995, pp. 76–82
work page 1995
-
[31]
An interruptible algorithm for perfect sampling via markov chains,
J. A. Fill, “An interruptible algorithm for perfect sampling via markov chains,” inProceedings of the twenty-ninth annual ACM symposium on Theory of computing, 1997, pp. 688–695
work page 1997
-
[32]
Analysis of top to random shuffles,
P. Diaconis, J. A. Fill, and J. Pitman, “Analysis of top to random shuffles,”Combinatorics, probability and computing, vol. 1, no. 2, pp. 135–155, 1992
work page 1992
-
[33]
Group representations in probability and statistics,
P. Diaconis, “Group representations in probability and statistics,”Lecture notes-monograph series, vol. 11, pp. i–192, 1988
work page 1988
-
[34]
Extended version of perfect privacy and strong stationary times for markovian sources,
F. Ye, Z. Liu, P. Parag, and S. E. Rouayheb, “Extended version of perfect privacy and strong stationary times for markovian sources,” 2026, extended version available at https ://www.dropbox.com/scl/fo/l2jqya2oa2tolum8u5fyj/AJTs3asXrcWJqmx- 6z2-6TY
work page 2026
-
[35]
J. A. Fill, “Eigenvalue bounds on convergence to stationarity for non- reversible markov chains, with an application to the exclusion process,” The annals of applied probability, pp. 62–87, 1991. APPENDIXA PROOF OFRESULTS INSECTIONV-A Proof of Lemma 1.First, by the chain rule, we have I(X0;Y [1:N] ) =I(X 0;Y 1, ..., Yτ , Yτ+1 , ..., YN )(20) (a) =I(X 0;τ,...
work page 1991
-
[36]
For everyt≥1, the quantity min y∈X P t(x, y) is independent of the initial statex∈ X
-
[37]
Proof of Proposition 1.We first prove thatP t holdsP t(gx, gy) =P t(x, y)by induction ont
The transition matrixPis doubly stochastic, and hence the uniform distributionπ(y) = 1/|X |is a stationary distribution of the chain. Proof of Proposition 1.We first prove thatP t holdsP t(gx, gy) =P t(x, y)by induction ont. The caset= 1is the assumption. Assume it holds for somet≥1. Then for anyx, y, P t+1(gx, gy) = X z∈Ω P t(gx, z)P(z, gy). Use the bije...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.