pith. machine review for the scientific record.

arxiv: 2604.10457 · v1 · submitted 2026-04-12 · 💻 cs.CC · cs.DS

Recognition: 2 theorem links · Lean Theorem

Near Optimal Algorithms for Noisy k-XOR under Low-Degree Heuristic

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 16:39 UTC · model grok-4.3

classification 💻 cs.CC cs.DS
keywords noisy k-XOR · recovery algorithm · low-degree heuristic · second-moment method · color coding · dynamic programming · hypergraph embedding

The pith

A recovery algorithm for noisy k-XOR succeeds with m samples above C_k n^{k/2} / (D^{k/2-1} δ²) in time n^{D+O(1)}.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs an explicit algorithm that recovers the hidden Boolean assignment from random noisy k-ary parity constraints. The running time is n^{D+O(1)}, i.e., the exponent is D up to an additive constant, and the algorithm succeeds as soon as the number of constraints meets the displayed threshold. This improves on prior detection-only results by also guaranteeing recovery. The same threshold is shown to be tight for low-degree methods via a matching second-moment calculation on the likelihood ratio.

Core claim

We give a recovery algorithm for noisy k-XOR that runs in time n^{D+O(1)} and succeeds whenever m ≥ C_k n^{k/2} / (D^{k/2-1} δ²). We also prove that the degree-D low-degree likelihood ratio has bounded L²-norm below the same threshold up to the factor D^{k/2-1}.
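As a rough aid to parsing the bound, a minimal sketch of ours (not code from the paper) that evaluates the stated sample threshold; the explicit constant C_k is not reproduced in this review, so it is replaced by a placeholder value of 1.

```python
# Sketch (not from the paper): evaluate the stated sample threshold
# m >= C_k * n^(k/2) / (D^(k/2-1) * delta^2). C_k is a placeholder constant.

def sample_threshold(n: int, k: int, D: int, delta: float, C_k: float = 1.0) -> float:
    """Samples sufficient for recovery at degree parameter D, per the stated bound."""
    return C_k * n ** (k / 2) / (D ** (k / 2 - 1) * delta ** 2)

# Doubling D lowers the threshold by a factor of 2^(k/2 - 1),
# at the price of raising the n^(D + O(1)) runtime.
m_D2 = sample_threshold(n=1000, k=4, D=2, delta=0.1)
m_D4 = sample_threshold(n=1000, k=4, D=4, delta=0.1)
assert abs(m_D2 / m_D4 - 2.0) < 1e-9
```

The quadratic dependence on 1/δ² is the information-theoretic scaling the abstract claims is matched up to constants.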

What carries the argument

Refined second-moment analysis of structured hypergraph embedding statistics, implemented with color coding and dynamic programming.
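The color-coding and subset-DP machinery named here is a standard primitive. As an illustration only, a compact sketch in the classic Alon-Yuster-Zwick style, shown for detecting a simple path on t vertices in an ordinary graph rather than for the paper's hypergraph embedding statistics:

```python
# Illustration of the color-coding + subset-DP primitive (Alon-Yuster-Zwick style),
# applied to path detection in a plain graph; the paper applies the same idea to
# structured hypergraph embedding statistics.
import random

def has_colorful_path(adj, colors, t):
    """Subset DP: does some simple path on t vertices use t distinct colors?"""
    # reachable[v] = color sets achievable by a colorful path ending at v
    reachable = {v: {frozenset([colors[v]])} for v in adj}
    for _ in range(t - 1):
        extended = {v: set() for v in adj}
        for u in adj:
            for used in reachable[u]:
                for v in adj[u]:
                    if colors[v] not in used:
                        extended[v].add(used | {colors[v]})
        reachable = extended
    # Each round adds exactly one color, so any surviving set has size t.
    return any(reachable[v] for v in adj)

def detect_path(adj, t, trials=200, seed=0):
    """Random colorings make some copy of the path colorful with constant probability."""
    rng = random.Random(seed)
    for _ in range(trials):
        colors = {v: rng.randrange(t) for v in adj}
        if has_colorful_path(adj, colors, t):
            return True
    return False

# A 4-cycle contains a path on 3 vertices.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
assert detect_path(cycle, t=3)
```

Tracking only color sets rather than vertex sets is what collapses the DP state space from exponential in n to exponential in t, which is the efficiency the paper's runtime leans on.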

If this is right

  • Recovery and detection thresholds coincide up to the explicit D factor.
  • The dependence on the noise bias δ is information-theoretically optimal up to constant factors.
  • Under the low-degree heuristic the algorithm is near-optimal across a broad range of D and δ.
  • The same second-moment technique yields matching lower bounds for the low-degree likelihood ratio.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The color-coding and dynamic-programming technique for hypergraph statistics may extend to other average-case problems on random hypergraphs.
  • If the low-degree heuristic holds, the result characterizes the full computational threshold curve for noisy k-XOR.
  • The explicit constant C_k and the precise D exponent invite numerical verification on moderate-sized instances.

Load-bearing premise

The low-degree heuristic accurately reflects computational hardness for noisy k-XOR.

What would settle it

An explicit low-degree polynomial that distinguishes the planted and null distributions below the stated threshold, or a polynomial-time algorithm that recovers the assignment above the threshold but with smaller D dependence.

read the original abstract

Noisy $k$-XOR is a basic average-case inference problem in which one observes random noisy $k$-ary parity constraints and seeks to recover, or more weakly, detect, a hidden Boolean assignment. A central question is to characterize the tradeoff among sample complexity, noise level, and running time. We give a recovery algorithm, and hence also a detection algorithm, for noisy $k$-XOR in the high-noise regime. For every parameter $D$, our algorithm runs in time $n^{D+O(1)}$ and succeeds whenever $$ m \ge C_k \frac{n^{k/2}}{D^{\,k/2-1}\delta^2}, $$ where $C_k$ is an explicit constant depending only on $k$, and $\delta$ is the noise bias. Our result matches the best previously known time--sample tradeoff for detection, while simultaneously yielding recovery guarantees. In addition, the dependence on the noise bias $\delta$ is optimal up to constant factors, matching the information-theoretic scaling. We also prove matching low-degree lower bounds. In particular, we show that the degree-$D$ low-degree likelihood ratio has bounded $L^2$-norm below the same threshold, up to the same factor $D^{k/2-1}$. Under the low-degree heuristic, this implies that our algorithm is near-optimal over a broad range of parameters. Our approach combines a refined second-moment analysis with color coding and dynamic programming for structured hypergraph embedding statistics. These techniques may be of independent interest for other average-case inference problems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proposes a recovery algorithm for the noisy k-XOR problem in the high-noise regime. For any parameter D, the algorithm runs in time n^{D + O(1)} and recovers the hidden assignment with high probability whenever the number of samples m satisfies m ≥ C_k n^{k/2} / (D^{k/2 - 1} δ²), where C_k is an explicit constant depending only on k. The approach relies on color coding and dynamic programming for computing structured hypergraph embedding statistics, combined with a refined second-moment analysis. Additionally, the paper proves matching lower bounds by showing that the L² norm of the degree-D low-degree likelihood ratio is bounded below the same threshold (up to the D^{k/2-1} factor). Under the low-degree heuristic, this establishes near-optimality of the algorithm.

Significance. This result is significant because it provides the first recovery guarantees that match the best known detection thresholds for noisy k-XOR, while also achieving the information-theoretically optimal dependence on the noise bias δ up to constants. The explicit nature of C_k and the matching low-degree lower bounds are strengths, as they make the result falsifiable and precise. The techniques of refined second-moment analysis on color-coded embeddings and DP for hypergraph statistics could be of independent interest for other average-case problems in computational complexity and statistical inference. If the moment calculations hold, it strengthens the evidence for the low-degree heuristic in this domain.

minor comments (3)
  1. [Abstract] The abstract states that C_k is 'an explicit constant depending only on k', but providing the closed-form expression for C_k in the abstract or introduction would improve readability and allow immediate verification of the constant factor.
  2. [Lower Bounds section] The low-degree lower bound is obtained by direct calculation of the likelihood ratio norm. Clarifying the exact definition of the degree-D polynomials used in the L² norm computation would help readers follow the matching to the algorithmic threshold.
  3. A table or figure summarizing the time-sample tradeoffs for varying D, k, and δ would aid in comparing to prior detection-only results.
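In the spirit of minor comment 3, a throwaway sketch of ours that tabulates the time-sample tradeoff across D; as elsewhere in this review, the explicit constant C_k is replaced by 1 as a placeholder.

```python
# Sketch (not from the paper): tabulate the time-sample tradeoff of the main
# theorem across D, with the explicit constant C_k replaced by 1 as a placeholder.
def tradeoff_rows(n, k, delta, Ds):
    rows = []
    for D in Ds:
        m = n ** (k / 2) / (D ** (k / 2 - 1) * delta ** 2)
        rows.append((D, f"n^({D}+O(1))", m))
    return rows

print(f"{'D':>3} {'runtime':>12} {'samples':>12}")
for D, runtime, m in tradeoff_rows(n=10_000, k=3, delta=0.05, Ds=[2, 4, 8, 16]):
    print(f"{D:>3} {runtime:>12} {m:>12.3g}")
```

The table makes the tradeoff the referee asks for visible at a glance: raising D buys a polynomially smaller sample requirement at an exponentially larger runtime.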

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our work, accurate summary of the contributions, and recommendation for minor revision. The referee's comments on significance and potential interest of the techniques are appreciated. Since the report lists no specific major comments, we have no individual points to address or rebut.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper constructs an explicit recovery algorithm for noisy k-XOR via color coding and dynamic programming on structured hypergraph embeddings, with success probability controlled by a refined second-moment analysis. Independently, it computes the L^2 norm of the degree-D low-degree likelihood ratio by direct calculation, establishing a matching threshold up to the factor D^{k/2-1}. These steps are separate; the algorithmic guarantee does not reduce to the lower-bound calculation (or vice versa) by definition, fitting, or self-citation. The low-degree heuristic is invoked only for interpretation of near-optimality, not as a load-bearing premise in the derivations themselves. No ansatzes, renamings, or uniqueness theorems from prior self-work appear in the core claims.
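To illustrate the kind of second-moment bookkeeping the rationale refers to, here is a toy model of our own, far simpler than the paper's k-XOR computation: a hidden sign s and m conditionally i.i.d. ±1 observations biased by δ toward s. The null second moment of the likelihood ratio has a closed form that a Monte Carlo run reproduces.

```python
# Toy second-moment computation (illustration only; not the paper's k-XOR analysis):
# hidden sign s in {-1,+1}; each observation equals s with prob (1+delta)/2 under
# the planted model and is uniform under the null.
import random

def likelihood_ratio(ys, delta):
    # L(y) = E_s prod_i (1 + delta * s * y_i), averaging over the hidden sign s.
    prod_p = prod_m = 1.0
    for y in ys:
        prod_p *= 1 + delta * y
        prod_m *= 1 - delta * y
    return 0.5 * (prod_p + prod_m)

def second_moment_mc(m, delta, trials=100_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        ys = [rng.choice((-1, 1)) for _ in range(m)]  # draw from the null
        total += likelihood_ratio(ys, delta) ** 2
    return total / trials

def second_moment_exact(m, delta):
    # E_null[L^2] = E_{s,s'} (1 + delta^2 * s * s')^m, since cross terms vanish.
    return 0.5 * ((1 + delta**2) ** m + (1 - delta**2) ** m)

est = second_moment_mc(m=10, delta=0.3)
assert abs(est - second_moment_exact(10, 0.3)) < 0.05
```

A bounded second moment of this kind is exactly what the low-degree lower bound asserts below threshold; in the paper the same expectation is organized over hypergraph embeddings rather than a single hidden sign.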

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard probabilistic analysis of the second-moment method for the algorithm's success and on the explicit computation of the low-degree likelihood ratio. No new entities or fitted parameters beyond the explicit constant C_k are introduced.

axioms (2)
  • domain assumption The input consists of independent noisy k-XOR constraints with fixed bias δ.
    This is the standard definition of the noisy k-XOR model used throughout the abstract.
  • standard math Color coding and dynamic programming correctly compute the required hypergraph embedding statistics in the stated runtime.
    These are established algorithmic primitives whose correctness is assumed from prior literature.
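The domain assumption above fixes the generative model; a minimal sampler of our own (names and parameters chosen for illustration) makes it concrete: each constraint picks k variables uniformly and reports their parity under the hidden assignment, with the correct sign appearing with probability (1+δ)/2.

```python
# Sketch of the noisy k-XOR generative model from the domain assumption:
# each constraint is a uniformly random k-subset of variables whose parity
# (under a hidden +-1 assignment x) is reported correctly with prob (1+delta)/2.
import random

def sample_noisy_kxor(n, k, m, delta, seed=0):
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]  # hidden assignment
    constraints = []
    for _ in range(m):
        idx = rng.sample(range(n), k)
        parity = 1
        for i in idx:
            parity *= x[i]
        noisy = parity if rng.random() < (1 + delta) / 2 else -parity
        constraints.append((idx, noisy))
    return x, constraints

x, cons = sample_noisy_kxor(n=50, k=3, m=5000, delta=0.2, seed=1)
# Sanity check: (reported sign) * (true parity) has empirical mean near delta.
agree = sum(b * x[i] * x[j] * x[l] for (i, j, l), b in cons) / len(cons)
assert abs(agree - 0.2) < 0.07
```

The bias δ here is exactly the quantity whose inverse-square enters the sample threshold in the main theorem.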

pith-pipeline@v0.9.0 · 5583 in / 1456 out tokens · 55253 ms · 2026-05-10T16:39:36.712138+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

17 extracted references · 7 canonical work pages

  1. [1]

    Strongly refuting all semi-random boolean csps

    [AGK21] Jackson Abascal, Venkatesan Guruswami, and Pravesh K Kothari. Strongly refuting all semi-random boolean csps. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 454–472. SIAM, 2021.

  2. [2]

    Near-optimal time-sparsity trade-offs for solving noisy linear equations

    [BBTV25] Kiril Bangachev, Guy Bresler, Stefan Tiegel, and Vinod Vaikuntanathan. Near-optimal time-sparsity trade-offs for solving noisy linear equations. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, pages 1910–1920, 2025.

  3. [3]

    Approximating the number of relevant variables in a parity implies proper learning

    [BH24] Nader H Bshouty and George Haddad. Approximating the number of relevant variables in a parity implies proper learning. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024), pages 38–1. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2024.

  4. [4]

    Average-case reductions for k-xor and tensor pca

    [BH26] Guy Bresler and Alina Harbuzova. Average-case reductions for k-xor and tensor pca. arXiv preprint arXiv:2601.19016, 2026.

  5. [5]

    The quasi-polynomial low-degree conjecture is false

    [BHJK25] Rares-Darius Buhai, Jun-Ting Hsieh, Aayush Jain, and Pravesh K Kothari. The quasi-polynomial low-degree conjecture is false. arXiv preprint arXiv:2505.17360, 2025.

  6. [6]

    Solving random planted CSPs below the n^{k/2} threshold

    [BHLM25] Arpon Basu, Jun-Ting Hsieh, Andrew D. Lin, and Peter Manohar. Solving random planted CSPs below the n^{k/2} threshold. arXiv preprint arXiv:2507.10833, 2025.

  7. [7]

    Computational hardness of certifying bounds on constrained pca problems

    [BKW20] Afonso S Bandeira, Dmitriy Kunisky, and Alexander S Wein. Computational hardness of certifying bounds on constrained pca problems. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020), volume 151, 2020.

  8. [8]

    Detecting correlation efficiently in very supercritical stochastic block models: Breaking the Otter's threshold barrier

    [CDGL26b] Guanyi Chen, Jian Ding, Shuyang Gong, and Zhangsong Li. Detecting correlation efficiently in very supercritical stochastic block models: Breaking the Otter's threshold barrier. In Proceedings of the 2026 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2743–2759. SIAM, 2026.

  9. [9]

    An efficient sparse regularity concept

    [COCF10] Amin Coja-Oghlan, Colin Cooper, and Alan Frieze. An efficient sparse regularity concept. SIAM Journal on Discrete Mathematics, 23(4):2000–2034, 2010.

  10. [10]

    Sub-exponential approximation schemes for csps: From dense to almost sparse

    [FLP16] Dimitris Fotakis, Michail Lampis, and Vangelis Paschos. Sub-exponential approximation schemes for csps: From dense to almost sparse. In 33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016), pages 37–1, 2016.

  11. [11]

    Exploration is harder than prediction: Cryptographically separating reinforcement learning from supervised learning

    [GMR24] Noah Golowich, Ankur Moitra, and Dhruv Rohatgi. Exploration is harder than prediction: Cryptographically separating reinforcement learning from supervised learning. In 2024 IEEE 65th Annual Symposium on Foundations of Computer Science (FOCS), pages 1953–1967. IEEE, 2024.

  12. [12]

    A simple and sharper proof of the hypergraph moore bound

    [HKM23] Jun-Ting Hsieh, Pravesh K Kothari, and Sidhanth Mohanty. A simple and sharper proof of the hypergraph moore bound. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2324–2344. SIAM, 2023.

  13. [13]

    Bayesian estimation from few samples: community detection and related problems

    [HS17a] Samuel B Hopkins and David Steurer. Bayesian estimation from few samples: community detection and related problems. arXiv preprint arXiv:1710.00264, 2017.

  14. [14]

    Counterexamples to the low-degree conjecture

    [HW21] Justin Holmgren and Alexander S Wein. Counterexamples to the low-degree conjecture. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021), pages 75–1. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2021.

  15. [15]

    The algorithmic phase transition in correlated spiked models

    [Li25a] Zhangsong Li. The algorithmic phase transition in correlated spiked models. arXiv preprint arXiv:2511.06040, 2025.

  16. [16]

    A smooth computational transition in tensor pca

    [Li25b] Zhangsong Li. A smooth computational transition in tensor pca. arXiv preprint arXiv:2509.09904, 2025.

  17. [17]

    Optimal spectral recovery of a planted vector in a subspace

    [MW21] Cheng Mao and Alexander S Wein. Optimal spectral recovery of a planted vector in a subspace. arXiv preprint arXiv:2105.15081, 2021.