pith. sign in

arxiv: 2601.04041 · v2 · submitted 2026-01-07 · 💻 cs.IT · math.IT

Serving Every Symbol: All-Symbol PIR and Batch Codes

Pith reviewed 2026-05-16 16:19 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords all-symbol PIR codesbatch codesfunctional batch codessimplex codeMDS codesminimum code lengthlinear codesprivate information retrieval
0
0 comments X

The pith

t-all-symbol PIR and batch codes recover any stored symbol from t disjoint server subsets, with minimum lengths determined for small k and t.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper defines t-all-symbol PIR codes and batch codes as arrangements of n servers that store linear combinations of k information symbols such that any single symbol or any multiset of t symbols can be recovered from t pairwise disjoint subsets of servers. The authors compute the smallest possible n for several small pairs of k and t, describe the structure of the shortest codes, and prove new cases in which the simplex code satisfies the stronger functional batch-code property. The work unifies one-step majority-logic decodable codes, PIR codes, and batch codes under a single recovery rule and derives bounds relating code length, dimension, minimum distance, and t. A sympathetic reader cares because the framework supplies concrete optimal constructions and trade-off curves for distributed storage systems that must protect every stored piece of data equally.

Core claim

The paper determines the minimum code length n for t-all-symbol PIR and batch codes of dimension k for selected small values of k and t, characterizes structural properties of codes that attain this optimum, and derives bounds that relate length, dimension, minimum distance, and t. It further shows how MDS codes and the simplex code fit inside the framework and establishes additional cases of the open conjecture that the simplex code is a t-functional batch code.

What carries the argument

The t-all-symbol recovery property, in which any stored symbol (or multiset of t symbols) is recovered from t pairwise disjoint subsets of servers that each supply a linear equation for that symbol.

If this is right

  • Exact minimum lengths n are now known for several small (k,t) pairs.
  • Codes attaining the minimum exhibit specific linear dependence patterns among the server equations.
  • Trade-off bounds limit how small n can be when minimum distance or t is increased.
  • The simplex code meets the t-functional batch-code definition for additional values of t.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The all-symbol requirement may force longer codes than ordinary PIR or batch codes but could improve worst-case retrieval latency in distributed systems.
  • The structural characterization could be used to test whether optimal codes for larger parameters follow the same design rules observed for small k and t.
  • Because the framework includes majority-logic decodable codes, fast decoding algorithms already known for those codes may apply directly to the new all-symbol variants.

Load-bearing premise

The servers store linear combinations of the k symbols and recovery of each symbol uses exactly t pairwise disjoint subsets.

What would settle it

A linear code with fewer than the claimed minimum servers for a concrete small pair such as k=3 and t=2 that still permits recovery of every symbol from two disjoint subsets would disprove the minimum-length result.

read the original abstract

A $t$-all-symbol PIR code and a $t$-all-symbol batch code of dimension $k$ consist of $n$ servers storing linear combinations of $k$ information symbols with the following recovery property: any symbol stored by a server can be recovered from $t$ pairwise disjoint subsets of servers. In the batch setting, we further require that any multiset of size $t$ of stored symbols can be recovered from~$t$ disjoint subsets of servers. This framework unifies and extends several well-known code families, including one-step majority-logic decodable codes, (functional) PIR codes, and (functional) batch codes. In this paper, we determine the minimum code length for some small values of $k$ and $t$, characterize structural properties of codes attaining this optimum, and derive bounds that show the trade-offs between length, dimension, minimum distance, and $t$. In addition, we study MDS codes and the simplex code, demonstrating how these classical families fit within our framework, and establish new cases of an open conjecture from \cite{YAAKOBI2020} concerning the minimal $t$ for which the simplex code is a $t$-functional batch code.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper introduces the notions of t-all-symbol PIR codes and t-all-symbol batch codes (including their functional variants) as a unifying framework that encompasses one-step majority-logic decodable codes, PIR codes, and batch codes. It determines the minimal code length n for several small pairs (k,t), characterizes structural properties of length-optimal codes, derives trade-off bounds relating n, k, minimum distance d, and t, examines how MDS codes and the simplex code fit the framework, and proves new cases of the open conjecture from YAAKOBI2020 on the minimal t for which the simplex code is a t-functional batch code.

Significance. If the explicit enumerations, constructions, and proofs hold, the work supplies concrete optimal parameters and structural characterizations for small instances together with explicit links to classical codes; these results can serve as base cases for inductive constructions and provide verifiable benchmarks for distributed storage and private retrieval applications.

minor comments (3)
  1. [Section presenting minimal lengths] §3 (or the section presenting the minimal-length tables): the optimality claims for small (k,t) rest on exhaustive search or enumeration; the manuscript should explicitly state the search space size, the verification method for the recovery property, and whether the search was exhaustive or heuristic.
  2. [Introduction / Definition section] The definition of t-all-symbol recovery (any stored symbol recoverable from t pairwise disjoint server subsets) is stated clearly in the abstract and introduction, but the transition to the multiset version for batch codes would benefit from a short illustrative example with k=3, t=2 to avoid ambiguity in the multiset case.
  3. [Bounds section] The trade-off bounds (relating n, k, d, t) are stated in a dedicated section; it would help to include a short table comparing the new bounds against the known Singleton-type or Plotkin-type bounds for the special cases t=1 and t=2.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper defines t-all-symbol PIR and batch codes from explicit linear-recovery properties over disjoint server subsets. Minimum lengths for small (k,t) are obtained via enumeration and constructions; structural properties and bounds follow directly from the recovery definition and minimum-distance relations. New cases of the YAAKOBI2020 conjecture are established by independent proofs for the simplex code. No step reduces by construction to a fitted parameter, self-definition, or unverified self-citation chain; the cited conjecture serves only as context for the new cases proved here.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on the standard assumption that codes are linear over finite fields and that recovery uses disjoint server subsets; no free parameters or invented entities are introduced beyond the new definitions.

axioms (1)
  • domain assumption Codes are linear combinations of k information symbols stored across n servers
    Stated in the definition of t-all-symbol PIR and batch codes.

pith-pipeline@v0.9.0 · 5521 in / 1127 out tokens · 46427 ms · 2026-05-16T16:19:19.541387+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 3 internal anchors

  1. [1]

    Bounds on the length of functional PIR and batch codes,

    Y. Zhang, T. Etzion, and E. Yaakobi, “Bounds on the length of functional PIR and batch codes,”IEEE Transactions on Information Theory, vol. 66, no. 8, pp. 4917–4934, 2020

  2. [2]

    Batch codes and their appli- cations,

    Y. Ishai, E. Kushilevitz, R. Ostrovsky, and A. Sahai, “Batch codes and their appli- cations,” inProceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’04, (New York, NY, USA), p. 262–271, Association for Computing Machinery, 2004

  3. [3]

    PIR with Low Storage Overhead: Coding instead of Replication

    A. Fazeli, A. Vardy, and E. Yaakobi, “PIR with low storage overhead: Coding instead of replication,”arXiv preprint arXiv:1505.06241, 2015

  4. [4]

    Codes for distributed PIR with low storage over- head,

    A. Fazeli, A. Vardy, and E. Yaakobi, “Codes for distributed PIR with low storage over- head,” in2015 IEEE International Symposium on Information Theory (ISIT), pp. 2852– 2856, IEEE, 2015

  5. [5]

    Private information retrieval without storage overhead: Coding instead of replication,

    A. Vardy and E. Yaakobi, “Private information retrieval without storage overhead: Coding instead of replication,”IEEE Journal on Selected Areas in Information Theory, vol. 4, pp. 286–301, 2023

  6. [6]

    Almost optimal construction of functional batch codes using extended simplex codes,

    L. Yohananov and E. Yaakobi, “Almost optimal construction of functional batch codes using extended simplex codes,”IEEE Transactions on Information Theory, vol. 68, no. 10, pp. 6434–6451, 2022

  7. [7]

    Combinatorial batch codes,

    D. R. Stinson, R. Wei, and M. B. Paterson, “Combinatorial batch codes,”Advances in Mathematics of Communications, vol. 3, no. 1, pp. 13–27, 2009

  8. [8]

    Combinatorial Batch Codes: A Lower Bound and Optimal Constructions

    S. Bhattacharya, S. Ruj, and B. Roy, “Combinatorial batch codes: A lower bound and optimal constructions,”arXiv preprint arXiv:1102.4951, 2011

  9. [9]

    Optimal combinatorial batch codes based on block designs,

    N. Silberstein and A. G´ al, “Optimal combinatorial batch codes based on block designs,” Designs, Codes and Cryptography, vol. 78, no. 2, pp. 409–424, 2016

  10. [10]

    Sparse hypergraphs with applications to coding theory,

    C. Shangguan and I. Tamo, “Sparse hypergraphs with applications to coding theory,” SIAM Journal on Discrete Mathematics, vol. 34, no. 3, pp. 1493–1504, 2020

  11. [11]

    Lower bounds for total storage of multiset com- binatorial batch codes using linear programming,

    Y. M. Chee, H. M. Kiah, and H. Zhang, “Lower bounds for total storage of multiset com- binatorial batch codes using linear programming,”IEEE Transactions on Information Theory, vol. 67, no. 1, pp. 255–267, 2020

  12. [12]

    Codes for network switches,

    Z. Wang, O. Shaked, Y. Cassuto, and J. Bruck, “Codes for network switches,” in2013 IEEE International Symposium on Information Theory, pp. 1057–1061, IEEE, 2013

  13. [13]

    Optimal binary switch codes with small query size,

    Z. Wang, H. M. Kiah, and Y. Cassuto, “Optimal binary switch codes with small query size,” in2015 IEEE International Symposium on Information Theory (ISIT), pp. 636– 640, IEEE, 2015

  14. [14]

    Combinatorial systematic switch codes,

    Y. M. Chee, F. Gao, S. T. H. Teo, and H. Zhang, “Combinatorial systematic switch codes,” in2015 IEEE International Symposium on Information Theory (ISIT), pp. 241– 245, IEEE, 2015

  15. [15]

    Consecutive switch codes,

    S. Buzaglo, Y. Cassuto, P. H. Siegel, and E. Yaakobi, “Consecutive switch codes,”IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 2485–2498, 2017. 20

  16. [16]

    Bounds and constructions for generalized batch codes,

    X. Kong and O. Elishco, “Bounds and constructions for generalized batch codes,”IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 6857–6876, 2024

  17. [17]

    Lower bounds on the redundancy of linear codes with disjoint repair groups,

    S. R. Karingula, A. Vardy, and M. Wootters, “Lower bounds on the redundancy of linear codes with disjoint repair groups,” in2022 IEEE International Symposium on Information Theory (ISIT), pp. 975–979, 2022

  18. [18]

    Lin and D

    S. Lin and D. J. Costello,Error control coding, vol. 2. Prentice hall Scarborough, 2001

  19. [19]

    Skachek,Batch and PIR Codes and Their Connections to Locally Repairable Codes, pp

    V. Skachek,Batch and PIR Codes and Their Connections to Locally Repairable Codes, pp. 427–442. Cham: Springer International Publishing, 2018

  20. [20]

    The length of functional batch and PIR codes,

    A. B. Kilic, A. Ravagnani, and F. Salizzoni, “The length of functional batch and PIR codes,”arXiv preprint arXiv:2508.02586, 2025

  21. [21]

    Constructions of batch codes with near-optimal redundancy,

    A. Vardy and E. Yaakobi, “Constructions of batch codes with near-optimal redundancy,” in2016 IEEE International Symposium on Information Theory (ISIT), pp. 1197–1201, 2016

  22. [22]

    Lower Bound on the Redundancy of PIR Codes

    S. Rao and A. Vardy, “Lower bound on the redundancy of PIR codes,”arXiv preprint arXiv:1605.01869, 2016

  23. [23]

    Switch codes: Codes for fully parallel reconstruction,

    Z. Wang, H. M. Kiah, Y. Cassuto, and J. Bruck, “Switch codes: Codes for fully parallel reconstruction,”IEEE Transactions on Information Theory, vol. 63, no. 4, pp. 2061– 2075, 2017. 21