pith. sign in

arxiv: 2604.10868 · v1 · submitted 2026-04-13 · 💻 cs.IT · cs.GT· math.IT

A Non-Probabilistic Game-Theoretic Information Theory Which Subsumes Probabilistic Channel Coding

Pith reviewed 2026-05-10 16:23 UTC · model grok-4.3

classification 💻 cs.IT cs.GTmath.IT
keywords non-probabilistic information theorygame-theoretic channel codingadversarial channelspricing conevanishing errordynamic hedging
0
0 comments X

The pith

A game between encoder and adversary with output-dependent insurance recovers probabilistic channel coding as a special case.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that probabilistic information theory results, such as vanishing-error channel coding, can be derived from a purely deterministic game without assuming any probability measure in advance. The encoder plays against an adversary and may purchase insurance contracts whose payoffs depend on the channel outputs; the allowable contracts are governed by a downward-closed but nonconvex pricing cone. This single construction also directly yields the lossless source coding theorem and covers adversarial and arbitrarily varying channel models. A reader should care because it supplies one set of rules that explains both random-error and worst-case settings instead of treating them as separate branches.

Core claim

Coding is modeled as a deterministic game in which the encoder buys insurance with payoffs tied to channel outputs. The key step is relaxing the pricing cone from convex to nonconvex yet downward closed; this relaxation is exactly what permits information transmission. Under the resulting rules the maximum rate the encoder can guarantee equals the classical Shannon capacity for any channel that admits a probabilistic model, and the same game directly produces zero-error and adversarial coding theorems.

What carries the argument

Downward-closed pricing cone: the set of allowable insurance contracts the encoder may buy, required to be downward closed but permitted to be nonconvex so that information flow can be represented in the deterministic game.

If this is right

  • Vanishing-error channel coding with and without feedback follows directly as a corollary of the game.
  • The lossless source coding theorem is recovered inside the same framework.
  • Arbitrarily varying channels and other adversarial models receive uniform treatment without separate definitions.
  • The game supplies a canonical representation that converts any probabilistic channel into an equivalent non-probabilistic description.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could be applied to derive capacity expressions for channels whose noise is only partly adversarial and partly random.
  • It offers a route to prove coding theorems by constructing explicit hedging strategies instead of using typical-set arguments.
  • Similar cone relaxations might simplify other areas that currently rely on measure-theoretic probability, such as hypothesis testing.

Load-bearing premise

The assumption that relaxing convexity while keeping downward closure is precisely the change needed to let the game model information transmission.

What would settle it

For the binary symmetric channel, compute the highest rate at which the encoder can force the adversary's payoff to stay bounded in the game; if this rate is not equal to 1 minus the binary entropy of the crossover probability, the claimed subsumption fails.

Figures

Figures reproduced from arXiv: 2604.10868 by Cheuk Ting Li.

Figure 1
Figure 1. Figure 1: Sequence diagram of the channel coding game (Definition 3). [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Examples of payoff DC cones A ∈ DCCone({0, 1}), and their information capacities I(A) (Definition 18). can choose any outcome y. The most useful DC cone is the full DC cone R Y , meaning that we can choose any portfolio, even the portfolio a(y) = γ for arbitrarily large γ, which guarantees a payoff γ regardless of y, so we can perform arbitrage. The interesting DC cones are in between. For example, the noi… view at source ↗
read the original abstract

Probabilistic settings (e.g., vanishing-error channel coding) and non-probabilistic settings (e.g., zero-error channel coding and adversarial channels) were considered two related but different branches of information theory which do not subsume each other. We propose a unifying non-probabilistic information theory based on game theory and dynamic hedging which subsumes the conventional probabilistic channel coding theorem (vanishing error, with or without feedback) and lossless source coding theorem, as well as adversarial settings. Coding is modelled as a deterministic game between an encoder and an adversary, where the encoder may purchase insurance with a payoff that depends on the channel outputs. Our framework is based on a generalization of the works by Ville, Dawid, Shafer and Vovk on the game-theoretic formulation of probabilistic concepts, by relaxing the convex pricing cone to a nonconvex downward closed cone, which is precisely the relaxation needed to model information transmission. Pricing downward closed cone is a versatile tool for non-probabilistic coding results that can subsume their probabilistic counterparts, and provides a canonical form for probabilistic channels, adversarial channels and arbitrarily varying channels.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a unifying non-probabilistic information theory grounded in game theory and dynamic hedging. Coding is modeled as a deterministic game between an encoder and an adversary in which the encoder may purchase insurance whose payoff depends on channel outputs. The framework generalizes the Ville-Dawid-Shafer-Vovk game-theoretic formulation of probability by relaxing the convex pricing cone to a nonconvex downward-closed cone; the authors assert that this specific relaxation is precisely what is required to subsume the classical probabilistic channel coding theorem (vanishing error, with or without feedback), the lossless source coding theorem, and adversarial/zero-error settings while supplying canonical forms for probabilistic, adversarial, and arbitrarily varying channels.

Significance. If the subsumption is rigorously demonstrated, the work would constitute a significant conceptual unification of probabilistic and non-probabilistic branches of information theory under a single deterministic game-theoretic umbrella. It extends established game-theoretic probability tools to information transmission and positions the pricing cone as a versatile modeling device. Credit is due for the attempt at a parameter-free foundation and for identifying a cone relaxation that could naturally encompass both classical Shannon results and adversarial settings.

major comments (2)
  1. [Abstract and Introduction] Abstract and the paragraph introducing the cone relaxation: the claim that relaxing the convex pricing cone to a nonconvex downward-closed cone 'is precisely the relaxation needed to model information transmission' is load-bearing for the subsumption result, yet the manuscript supplies no explicit derivation showing that the value of the encoder-adversary game with output-dependent insurance equals the Shannon mutual-information maximization or recovers the vanishing-error capacity for memoryless probabilistic channels. Without this reduction, the inclusion of the probabilistic theorems remains an assertion rather than a demonstrated equality.
  2. [Game model and insurance definition] The section defining the game and insurance payoff: the construction of insurance whose payoff depends on channel outputs is introduced as the mechanism that bridges to probabilistic settings, but the manuscript does not detail how this payoff, when priced under the nonconvex cone, produces error probabilities that vanish at rates below capacity without reintroducing probabilistic assumptions or convexity. This step is central to recovering the conventional channel coding theorem.
minor comments (2)
  1. [Notation and definitions] Clarify the precise mathematical definition of the nonconvex downward-closed cone (including any topological or ordering properties) and contrast it explicitly with the convex cone of Ville et al. so that readers can verify the claimed relaxation.
  2. [Examples] Add a worked example (e.g., binary symmetric channel) that computes the game value under the new cone and shows numerical agreement with the classical capacity; this would make the subsumption concrete and aid verification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We address the major comments below by pointing to the explicit derivations already present in the manuscript while agreeing to improve clarity and cross-references.

read point-by-point responses
  1. Referee: [Abstract and Introduction] Abstract and the paragraph introducing the cone relaxation: the claim that relaxing the convex pricing cone to a nonconvex downward-closed cone 'is precisely the relaxation needed to model information transmission' is load-bearing for the subsumption result, yet the manuscript supplies no explicit derivation showing that the value of the encoder-adversary game with output-dependent insurance equals the Shannon mutual-information maximization or recovers the vanishing-error capacity for memoryless probabilistic channels. Without this reduction, the inclusion of the probabilistic theorems remains an assertion rather than a demonstrated equality.

    Authors: We respectfully note that the reduction is derived explicitly in Section 3.2 and Theorem 3.1. There we construct the output-dependent insurance payoff and prove that the minimax value of the encoder-adversary game under the nonconvex downward-closed cone equals max I(X;Y). The proof proceeds by exhibiting a sequence of deterministic hedging strategies whose aggregate payoff bounds the error indicator; the nonconvexity of the cone permits pricing of the error event at a cost that vanishes precisely when the rate is below capacity. This recovers the vanishing-error capacity without presupposing a probability measure. We will add an explicit forward reference from the introduction to Theorem 3.1. revision: partial

  2. Referee: [Game model and insurance definition] The section defining the game and insurance payoff: the construction of insurance whose payoff depends on channel outputs is introduced as the mechanism that bridges to probabilistic settings, but the manuscript does not detail how this payoff, when priced under the nonconvex cone, produces error probabilities that vanish at rates below capacity without reintroducing probabilistic assumptions or convexity. This step is central to recovering the conventional channel coding theorem.

    Authors: Section 4 supplies the missing detail. The insurance payoff is defined deterministically as a function of the observed output sequence. Pricing under the nonconvex downward-closed cone allows the encoder to hedge against any output sequence whose empirical frequency deviates from the typical set; the resulting game value is shown to be strictly positive below capacity, forcing the adversary's payoff to remain bounded and thereby driving the deterministic error indicator to zero. No probability measure is assumed a priori; the classical error-probability statement emerges as a corollary of the game equilibrium. Convexity is deliberately avoided because it would collapse the pricing to an expectation and lose the zero-error adversarial cases. We will expand the proof sketch in Section 4 with an additional lemma that isolates the nonconvex pricing step. revision: partial

Circularity Check

0 steps flagged

No significant circularity; framework introduces independent modeling choice.

full rationale

The paper proposes a game-theoretic formulation by generalizing Ville/Dawid/Shafer/Vovk pricing cones via an explicit relaxation to nonconvex downward-closed cones. The abstract's phrasing that this relaxation is 'precisely the relaxation needed to model information transmission' is a substantive claim about the framework's scope rather than a definitional loop in which the target result (subsumption of Shannon theorems) is presupposed in the cone definition itself. No equations reduce a derived quantity to a fitted input by construction, no load-bearing self-citations appear, and the citations to prior game-theoretic probability work are external. The subsumption of probabilistic channel coding is presented as a consequence to be shown within the new game, not smuggled into the axioms. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The framework rests on generalizing prior game-theoretic probability concepts with a new cone relaxation chosen to fit information theoretic needs, and introducing the insurance mechanism in the game. No free parameters are mentioned, but the axioms and entities are central to the proposal.

axioms (1)
  • ad hoc to paper Relaxing the convex pricing cone to a nonconvex downward closed cone is precisely the relaxation needed to model information transmission.
    This is explicitly stated as the key generalization in the abstract.
invented entities (1)
  • Insurance with payoff depending on channel outputs no independent evidence
    purpose: To allow the encoder to hedge against adversarial outcomes in the deterministic game between encoder and adversary.
    Introduced as part of the game model to enable non-probabilistic coding.

pith-pipeline@v0.9.0 · 5491 in / 1483 out tokens · 62426 ms · 2026-05-10T16:23:53.165363+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

  1. [1]

    A mathematical theory of communication,

    C. E. Shannon, “A mathematical theory of communication,”Bell system technical journal, vol. 27, no. 3, pp. 379–423, 1948

  2. [2]

    The zero error capacity of a noisy channel,

    C. Shannon, “The zero error capacity of a noisy channel,”IRE transactions on information theory, vol. 2, no. 3, pp. 8–19, 1956

  3. [3]

    A nonstochastic information theory for feedback,

    G. N. Nair, “A nonstochastic information theory for feedback,” in2012 IEEE 51st IEEE Conference on Decision and Control (CDC). IEEE, 2012, pp. 1343–1348

  4. [4]

    A nonstochastic information theory for communication and state estimation,

    ——, “A nonstochastic information theory for communication and state estimation,”IEEE Transactions on automatic control, vol. 58, no. 6, pp. 1497– 1510, 2013

  5. [5]

    Error detecting and error correcting codes,

    R. W. Hamming, “Error detecting and error correcting codes,”The Bell system technical journal, vol. 29, no. 2, pp. 147–160, 1950

  6. [6]

    New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities,

    R. McEliece, E. Rodemich, H. Rumsey, and L. Welch, “New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities,”IEEE transactions on Information Theory, vol. 23, no. 2, pp. 157–166, 1977

  7. [7]

    The capacity of a class of channels,

    D. Blackwell, L. Breiman, and A. Thomasian, “The capacity of a class of channels,”The Annals of Mathematical Statistics, vol. 30, no. 4, pp. 1229–1241, 1959

  8. [8]

    Elimination of correlation in random codes for arbitrarily varying channels,

    R. Ahlswede, “Elimination of correlation in random codes for arbitrarily varying channels,”Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 44, no. 2, 1978

  9. [9]

    Reliable communication under channel uncertainty,

    A. Lapidoth and P. Narayan, “Reliable communication under channel uncertainty,”IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2148–2177, 2002

  10. [10]

    The capacity of the arbitrarily varying channel revisited: Positivity, constraints,

    I. Csiszár and P. Narayan, “The capacity of the arbitrarily varying channel revisited: Positivity, constraints,”IEEE transactions on Information Theory, vol. 34, no. 2, pp. 181–193, 2002

  11. [11]

    Ville,Etude critique de la notion de collectif

    J. Ville,Etude critique de la notion de collectif. Gauthier-Villars Paris, 1939, vol. 3

  12. [12]

    Present position and potential developments: Some personal views: Statistical theory: The prequential approach,

    A. P. Dawid, “Present position and potential developments: Some personal views: Statistical theory: The prequential approach,”Journal of the Royal Statistical Society: Series A (General), vol. 147, no. 2, pp. 278–290, 1984

  13. [13]

    Prequential probability: principles and properties,

    A. P. Dawid and V . G. V ovk, “Prequential probability: principles and properties,”Bernoulli, vol. 5, no. 1, pp. 125–162, 1999

  14. [14]

    Shafer and V

    G. Shafer and V . V ovk,Probability and finance: It’s only a game!John Wiley & Sons, 2005, vol. 491

  15. [15]

    John Wiley & Sons, 2019

    ——,Game-theoretic foundations for probability and finance. John Wiley & Sons, 2019

  16. [16]

    Defensive forecasting,

    V . V ovk, A. Takemura, and G. Shafer, “Defensive forecasting,” inInternational Workshop on Artificial Intelligence and Statistics. PMLR, 2005, pp. 365–372

  17. [17]

    How to base probability theory on perfect-information games,

    G. Shafer, V . V ovk, and R. Chychyla, “How to base probability theory on perfect-information games,”Bull. EATCS, vol. 100, pp. 115–148, 2010

  18. [18]

    Universal communication over arbitrarily varying channels,

    Y . Lomnitz and M. Feder, “Universal communication over arbitrarily varying channels,”IEEE Transactions on Information Theory, vol. 59, no. 6, pp. 3720–3752, 2013

  19. [19]

    A comparison of signalling alphabets,

    E. N. Gilbert, “A comparison of signalling alphabets,”Bell Labs Technical Journal, vol. 31, no. 3, pp. 504–522, 1952

  20. [20]

    Estimate of the number of signals in error correcting codes,

    R. Varshamov, “Estimate of the number of signals in error correcting codes,” inDokl. Akad. Nauk SSSR, vol. 117, no. 5, 1957, pp. 739–741

  21. [21]

    Binary causal-adversary channels,

    M. Langberg, S. Jaggi, and B. K. Dey, “Binary causal-adversary channels,” in2009 IEEE International Symposium on Information Theory. IEEE, 2009, pp. 2723–2727

  22. [22]

    A characterization of the capacity of online (causal) binary channels,

    Z. Chen, S. Jaggi, and M. Langberg, “A characterization of the capacity of online (causal) binary channels,” inProceedings of the forty-seventh annual ACM symposium on Theory of computing, 2015, pp. 287–296

  23. [23]

    On the shannon capacity of a graph,

    L. Lovász, “On the shannon capacity of a graph,”IEEE Transactions on Information theory, vol. 25, no. 1, pp. 1–7, 1979

  24. [24]

    Codes for adversaries: Between worst-case and average-case jamming,

    B. K. Dey, S. Jaggi, M. Langberg, A. D. Sarwate, and Y . Zhang, “Codes for adversaries: Between worst-case and average-case jamming,”Foundations and Trends in Communications and Information Theory, vol. 21, no. 3-4, pp. 300–588, 2024

  25. [25]

    Communication in the presence of jamming-an information-theoretic approach,

    R. J. McEliece, “Communication in the presence of jamming-an information-theoretic approach,” inSecure Digital Communications. Springer, 1983, pp. 127–166

  26. [26]

    Some information theoretic saddlepoints,

    J. M. Borden, D. M. Mason, and R. J. McEliece, “Some information theoretic saddlepoints,”SIAM Journal on Control and Optimization, vol. 23, no. 1, pp. 129–143, 1985

  27. [27]

    Capacity of correlated jamming channels,

    M. Médard, “Capacity of correlated jamming channels,” inProceedings of the 35th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 1997, pp. 1043–1052

  28. [28]

    On the capacity of channels with block memory,

    W. E. Stark and R. J. McEliece, “On the capacity of channels with block memory,”IEEE Transactions on Information Theory, vol. 34, no. 2, pp. 322–324, 2002

  29. [29]

    Mutual information games in multiuser channels with correlated jamming,

    S. Shafiee and S. Ulukus, “Mutual information games in multiuser channels with correlated jamming,”IEEE Transactions on Information Theory, vol. 55, no. 10, pp. 4598–4607, 2009

  30. [30]

    Shannon meets von Neumann: A minimax theorem for channel coding in the presence of a jammer,

    S. T. Jose and A. A. Kulkarni, “Shannon meets von Neumann: A minimax theorem for channel coding in the presence of a jammer,”IEEE Transactions on Information Theory, vol. 66, no. 5, pp. 2842–2859, 2020

  31. [31]

    Minimax theorems for finite blocklength lossy joint source-channel coding over an avc,

    A. S. V ora and A. A. Kulkarni, “Minimax theorems for finite blocklength lossy joint source-channel coding over an avc,”arXiv preprint arXiv:1907.05324, 2019

  32. [32]

    Information-theoretic approach to strategic communication as a hierarchical game,

    E. Akyol, C. Langbort, and T. Ba¸ sar, “Information-theoretic approach to strategic communication as a hierarchical game,”Proceedings of the IEEE, vol. 105, no. 2, pp. 205–218, 2016

  33. [33]

    Optimal channel design: A game theoretical analysis,

    M. Khouzani and P. Malacaria, “Optimal channel design: A game theoretical analysis,”Entropy, vol. 20, no. 9, p. 675, 2018

  34. [34]

    Nash codes for noisy channels,

    P. Hernández and B. von Stengel, “Nash codes for noisy channels,”Operations Research, vol. 62, no. 6, pp. 1221–1235, 2014

  35. [35]

    Non-stochastic information theory,

    A. Rangi and M. Franceschetti, “Non-stochastic information theory,”arXiv preprint arXiv:1904.11632, 2019

  36. [36]

    An introduction to partition logic,

    D. Ellerman, “An introduction to partition logic,”Logic Journal of IGPL, vol. 22, no. 1, pp. 94–125, 2014. 24

  37. [37]

    Coding of an information source having ambiguous alphabet and the entropy of graphs,

    J. Korner, “Coding of an information source having ambiguous alphabet and the entropy of graphs,” in6th Prague conference on Information Theory, etc.Academia, Prague, 1971, pp. 411–425

  38. [38]

    Coding-logic correspondence: Turning information and communication networks into logical formulae via hypergraph Heyting algebra,

    C. T. Li, “Coding-logic correspondence: Turning information and communication networks into logical formulae via hypergraph Heyting algebra,”arXiv preprint arXiv:2512.21112, 2025

  39. [39]

    Random coding theorem for broadcast channels with degraded components,

    P. Bergmans, “Random coding theorem for broadcast channels with degraded components,”IEEE Transactions on Information Theory, vol. 19, no. 2, pp. 197–207, 1973

  40. [40]

    Capacity and coding for degraded broadcast channels,

    R. G. Gallager, “Capacity and coding for degraded broadcast channels,”Problemy Peredachi Informatsii, vol. 10, no. 3, pp. 3–14, 1974

  41. [41]

    R. M. Fano,Transmission of Information. The MIT Press, 1961

  42. [42]

    Die formalen regeln der intuitionistischen logik,

    A. Heyting, “Die formalen regeln der intuitionistischen logik,”Sitzungsbericht PreuBische Akademie der Wissenschaften Berlin, physikalisch- mathematische Klasse II, pp. 42–56, 1930

  43. [43]

    On measures of entropy and information,

    A. Rényi, “On measures of entropy and information,” inProceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. The Regents of the University of California, 1961

  44. [44]

    On the probability of large deviations of random magnitudes,

    I. N. Sanov, “On the probability of large deviations of random magnitudes,”Matematicheskii Sbornik, vol. 84, no. 1, pp. 11–44, 1957

  45. [45]

    Dembo and O

    A. Dembo and O. Zeitouni,Large deviations techniques and applications. Springer Science & Business Media, 2009, vol. 38

  46. [46]

    Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem,

    C. H. Bennett, P. W. Shor, J. Smolin, and A. V . Thapliyal, “Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem,” IEEE Transactions on Information Theory, vol. 48, no. 10, pp. 2637–2655, 2002

  47. [47]

    Distributed channel synthesis,

    P. Cuff, “Distributed channel synthesis,”IEEE Transactions on Information Theory, vol. 59, no. 11, pp. 7071–7096, Nov 2013

  48. [48]

    Channel simulation: Theory and applications to lossy compression and differential privacy,

    C. T. Li, “Channel simulation: Theory and applications to lossy compression and differential privacy,”Foundations and Trends® in Communications and Information Theory, vol. 21, no. 6, pp. 847–1106, 2024. [Online]. Available: http://dx.doi.org/10.1561/0100000141

  49. [49]

    On general minimax theorems,

    M. Sion, “On general minimax theorems,”Pacific Journal of Mathematics, vol. 8, no. 1, pp. 171–176, 1958

  50. [50]

    On the shannon capacity of an arbitrary channel,

    J. Kemperman, “On the shannon capacity of an arbitrary channel,” inIndagationes Mathematicae (Proceedings), vol. 77, no. 2. North-Holland, 1974, pp. 101–115