Linear Code Conversion in the Merge Regime: General Bounds and Reed--Muller Constructions
Pith reviewed 2026-06-26 02:07 UTC · model grok-4.3
The pith
Universal lower bounds on write and read costs for converting arbitrary linear codes during merges are derived and attained by Reed-Muller constructions for write cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For scalar linear code conversion in the merge regime, universal lower bounds exist on the write and read costs measured by unchanged and read symbols; these bounds can be refined using generalized Hamming weights to become strictly stronger when the final code exhibits nontrivial jumps in its generalized Hamming weight hierarchy, the framework recovers known bounds for important special cases, and Reed-Muller codes constructed via the Plotkin decomposition attain the derived write-cost lower bound in a natural parameter regime while the read-cost analysis is sharp for one initial block.
What carries the argument
Universal lower bounds on conversion costs refined by the generalized Hamming weights of the target code, which capture support-growth properties of its subcodes.
If this is right
- The bounds recover all previously known results for important special cases of code conversion.
- The refinement is strictly stronger than minimum-distance arguments exactly when the final code has jumps in its generalized Hamming weight hierarchy.
- Reed-Muller codes attain the write-cost lower bound for natural parameter regimes via the Plotkin decomposition.
- The generalized-Hamming-weight analysis is tight for read cost on one initial block but leaves a gap on the other block.
Where Pith is reading between the lines
- Storage systems could switch erasure-code parameters on the fly with far lower overhead than full re-encoding if these bounds are met by practical codes.
- The same bounding technique might be applied to conversion regimes other than merging or to non-linear codes.
- Other algebraic code families could be checked to see whether they also saturate the write-cost bound.
Load-bearing premise
That generalized Hamming weights yield strictly sharper cost estimates than minimum-distance arguments alone precisely when the target code has nontrivial jumps in its weight hierarchy.
What would settle it
An explicit pair of linear codes with a nontrivial generalized Hamming weight jump for which the minimum-distance-only bound already matches the refined bound, or a merge conversion whose measured write cost falls below the stated lower bound.
read the original abstract
Erasure codes are a core component of most existing large-scale distributed storage systems, ensuring reliability against node failures. Recent work has shown that adapting code parameters to changing node failure rates can lead to significant storage savings. The default approach is to re-encode the data under a new code, which consumes substantial system resources. Code conversion was introduced to reduce this cost. However, existing work has mainly focused on conversions within specific classes of codes. In this paper, we study scalar linear code conversion in the merge regime for arbitrary linear codes. We derive universal lower bounds on the write and read costs in terms of unchanged and read symbols. The bounds are refined using generalized Hamming weights, which capture support-growth properties of subcodes and can give sharper estimates than minimum-distance-only arguments. We show that the framework recovers known bounds for important special cases and can be strictly stronger when the final code has nontrivial jumps in its generalized Hamming weight hierarchy. We then apply the framework to Reed-Muller codes and construct explicit Reed-Muller convertible codes using the Plotkin decomposition. For a natural Reed-Muller parameter regime, the construction attains the derived write-cost lower bound. For the read cost, the generalized-Hamming-weight analysis is sharp for one initial block, while a gap remains for the other block.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies scalar linear code conversion in the merge regime for arbitrary linear codes. It derives universal lower bounds on write and read costs in terms of unchanged and read symbols, refines these bounds using generalized Hamming weights to capture subcode support growth, shows that the framework recovers known bounds for special cases and can be strictly stronger for codes with nontrivial jumps in the GHW hierarchy, and constructs explicit Reed-Muller convertible codes via Plotkin decomposition that attain the write-cost lower bound in a natural parameter regime (with a remaining gap for read cost on one block).
Significance. If the derivations and attainability hold, the work provides a general framework extending code conversion beyond specific code families, enabling more efficient adaptation of erasure codes to varying node failure rates in distributed storage. Strengths include the explicit recovery of prior bounds, the use of generalized Hamming weights for sharper estimates when the hierarchy has jumps, and the explicit RM constructions meeting the write-cost bound.
minor comments (2)
- [Abstract] Abstract: the phrase 'natural Reed-Muller parameter regime' is used without a precise parameter range; a brief inline definition or reference to the relevant section would improve accessibility.
- [Abstract] The abstract notes a remaining gap for read cost on one initial block; a short statement in the introduction or conclusion on whether closing this gap is left for future work would clarify the scope.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were provided in the report.
Circularity Check
No significant circularity identified
full rationale
The paper derives universal lower bounds on write/read costs directly from general properties of linear codes and the definition of generalized Hamming weights, without any reduction to fitted parameters or self-referential inputs. The Reed-Muller construction applies the standard Plotkin decomposition to attain the write-cost bound in a stated regime, and the framework's recovery of known special-case bounds serves as a consistency check rather than a circular dependency. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation are present in the derivation chain.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Linear codes are vector spaces over finite fields with standard support and weight properties
- domain assumption Generalized Hamming weights capture support-growth properties of subcodes
Reference graph
Works this paper leans on
-
[1]
Convertible codes for data and device heterogeneity,
A. Gruica, B. Jany, and S. Kruglik, “Convertible codes for data and device heterogeneity,” in2026 IEEE In- ternational Symposium on Information Theory (ISIT), 2026, pp. 1–5
2026
-
[2]
Erasure coding for distributed storage: An overview,
S. Balaji, M. N. Krishnan, M. Vajha, V . Ramkumar, B. Sasidharan, and P. V . Kumar, “Erasure coding for distributed storage: An overview,”Science China Infor- mation Sciences, vol. 61, no. 10, p. 100 301, 2018
2018
-
[3]
A survey of the past, present, and future of erasure coding for storage systems,
Z. Shen et al., “A survey of the past, present, and future of erasure coding for storage systems,”ACM Transactions on Storage, vol. 21, no. 1, pp. 1–39, 2025
2025
-
[4]
Roth,Introduction to coding theory
R. Roth,Introduction to coding theory. Cambridge University Press, 2006
2006
-
[5]
On the locality of codeword symbols,
P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, “On the locality of codeword symbols,”IEEE Transactions on Information Theory, vol. 58, no. 11, pp. 6925–6934, 2012
2012
-
[6]
On the service rate region of Reed–Muller codes,
H. Ly, E. Soljanin, and V . Lalitha, “On the service rate region of Reed–Muller codes,”IEEE Transactions on Information Theory, pp. 1–1, 2026
2026
-
[7]
Maximal achievable service rates of codes and connections to combinatorial de- signs,
H. Ly and E. Soljanin, “Maximal achievable service rates of codes and connections to combinatorial de- signs,”arXiv preprint arXiv:2506.16983, 2025
arXiv 2025
-
[8]
Bounds and constructions of codes with all-symbol locality and availability,
S. Kruglik and A. Frolov, “Bounds and constructions of codes with all-symbol locality and availability,” in2017 IEEE International Symposium on Information Theory (ISIT), IEEE, 2017, pp. 1023–1027
2017
-
[9]
Zigzag codes revisited: From optimal rebuilding to small skip cost and small fields,
W. Zhang, H. M. Kiah, and S. H. Dau, “Zigzag codes revisited: From optimal rebuilding to small skip cost and small fields,”arXiv preprint arXiv:2509.23090, 2025
arXiv 2025
-
[10]
Enabling optimal access and error correction for the repair of Reed– Solomon codes,
Z. Chen, M. Ye, and A. Barg, “Enabling optimal access and error correction for the repair of Reed– Solomon codes,”IEEE Transactions on Information Theory, vol. 66, no. 12, pp. 7439–7456, 2020
2020
-
[11]
Cluster storage systems gotta have HeART: Improving storage efficiency by exploiting disk-reliability heterogeneity,
S. Kadekodi, K. V . Rashmi, and G. R. Ganger, “Cluster storage systems gotta have HeART: Improving storage efficiency by exploiting disk-reliability heterogeneity,” inProceedings of the 17th USENIX Conference on File and Storage Technologies (FAST), 2019, pp. 345–358
2019
-
[12]
Convertible codes: En- abling efficient conversion of coded data in distributed storage,
F. Maturana and K. Rashmi, “Convertible codes: En- abling efficient conversion of coded data in distributed storage,”IEEE Transactions on Information Theory, vol. 68, no. 7, pp. 4392–4407, 2022
2022
-
[13]
Bandwidth cost of code conversions in distributed storage: Fundamental limits and optimal constructions,
F. Maturana and K. Rashmi, “Bandwidth cost of code conversions in distributed storage: Fundamental limits and optimal constructions,”IEEE Transactions on In- formation Theory, vol. 69, no. 8, pp. 4993–5008, 2023
2023
-
[14]
Access- optimal linear MDS convertible codes for all parame- ters,
F. Maturana, V . C. Mukka, and K. Rashmi, “Access- optimal linear MDS convertible codes for all parame- ters,” in2020 IEEE International Symposium on Infor- mation Theory (ISIT), IEEE, 2020, pp. 577–582
2020
-
[15]
On MDS convertible codes in the merge regime,
V . Ramkumar, X. Kong, G. Yeswanth Sai, M. Vajha, and M. Nikhil Krishnan, “On MDS convertible codes in the merge regime,”IEEE Transactions on Information Theory, pp. 1–1, 2026
2026
-
[16]
On low field size constructions of access-optimal convertible codes,
S. Chopra, F. Maturana, and K. Rashmi, “On low field size constructions of access-optimal convertible codes,” in2024 IEEE International Symposium on Information Theory (ISIT), 2024, pp. 1456–1461
2024
-
[17]
Bounds and optimal constructions of generalized merge-convertible codes for code conversion into LRCs,
H. Shi, W. Fang, and Y . Gao, “Bounds and optimal constructions of generalized merge-convertible codes for code conversion into LRCs,”IEEE Transactions on Information Theory, vol. 72, no. 5, pp. 2841–2860, 2026
2026
-
[18]
Locally repairable con- vertible codes: Improved lower bound and general con- struction,
S. Ge, H. Cai, and X. Tang, “Locally repairable con- vertible codes: Improved lower bound and general con- struction,”IEEE Transactions on Information Theory, vol. 72, no. 5, pp. 2915–2930, 2026
2026
-
[19]
Locally repairable convertible codes with op- timal access costs,
X. Kong, “Locally repairable convertible codes with op- timal access costs,”IEEE Transactions on Information Theory, vol. 70, no. 9, pp. 6239–6257, 2024
2024
-
[20]
MDS generalized convert- ible code,
S. Ge, H. Cai, and X. Tang, “MDS generalized convert- ible code,”arXiv preprint arXiv:2407.14304, 2024
arXiv 2024
-
[21]
Generalized Hamming weights for lin- ear codes,
V . K. Wei, “Generalized Hamming weights for lin- ear codes,”IEEE Transactions on Information Theory, vol. 37, no. 5, pp. 1412–1418, 1991
1991
-
[22]
On the generalized Hamming weights of (r,δ)-locally repairable codes,
J. Hao and B. Chen, “On the generalized Hamming weights of (r,δ)-locally repairable codes,”IEEE Access, vol. 8, pp. 149 706–149 713, 2020
2020
-
[23]
F. J. MacWilliams and N. J. A. Sloane,The Theory of Error-Correcting Codes. Elsevier, 1977, vol. 16
1977
-
[24]
Reed–Muller codes: Theory and algorithms,
E. Abbe, A. Shpilka, and M. Ye, “Reed–Muller codes: Theory and algorithms,”IEEE Transactions on Infor- mation Theory, vol. 67, no. 6, pp. 3251–3277, 2020
2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.