arxiv: 2604.09837 · v1 · submitted 2026-04-10 · 🪐 quant-ph · cs.LO

Recognition: no theorem link

Planted-solution SAT and Ising benchmarks from integer factorization

Itay Hen

Authors on Pith no claims yet

Pith reviewed 2026-05-10 17:42 UTC · model grok-4.3

classification 🪐 quant-ph cs.LO

keywords planted solutionsSAT benchmarksinteger factorizationCNF encodingcarry contractionsIsing optimizationexponential runtimebenchmark generation

0 comments

The pith

Encoding the multiplication of two primes produces CNF formulas whose satisfying assignments are exactly the factor pairs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a construction that turns the task of factoring a number N equal to p times q into a SAT problem by writing out all the logical constraints of binary multiplication. Any solution to the resulting CNF formula yields the primes p and q, and because those primes are known in advance they serve as a planted solution that lets users confirm whether a solver succeeded. The construction scales with the bit length d of the primes, producing formulas whose complexity is on the order of d to the fourth. Tests with standard SAT solvers reveal that the median time required grows exponentially as d increases.

Core claim

Given two primes p and q, the arithmetic constraints of N = p × q are encoded as a conjunctive normal form formula whose satisfying assignments correspond to valid factorizations of N. The known pair (p, q) serves as a built-in ground truth, enabling unambiguous verification of solver output. For two d-bit primes the total number of carry contractions is on the order of d^4, and empirical benchmarks show that median runtime grows exponentially in the bit-length of the factors.

What carries the argument

The CNF encoding of multiplication constraints through carry contractions that enforce correct addition at each bit position.

If this is right

Solver results can be verified simply by multiplying the reported factors and confirming the product equals N.
Problem instances are generated from a single parameter, the bit length d, which controls both size and expected difficulty.
Median runtime scales exponentially with d, giving a clear measure of solver performance on these structured problems.
The encoding extends to Ising models, supplying similar planted-solution benchmarks for optimization algorithms.
Open-source generation code allows anyone to produce fresh instances at any desired scale.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

These instances could help test whether SAT solvers that excel on random problems maintain performance when the clauses encode arithmetic relations.
Applying the method to larger d might show whether the exponential trend holds or if some solvers eventually find shortcuts.
The planted-solution technique might be applied to other computational problems to create families of benchmarks with guaranteed correct answers.

Load-bearing premise

The CNF clauses must fully capture every constraint of the multiplication while ensuring that no incorrect factor pair satisfies the formula.

What would settle it

Finding a satisfying assignment whose extracted numbers do not multiply to the target N, or observing that the known factors fail to satisfy the generated formula.

Figures

Figures reproduced from arXiv: 2604.09837 by Itay Hen.

**Figure 2.** Figure 2: illustrates the column-population sequence mk for several values of d. The three-phase structure is clearly visible: a rapid quadratic ramp during the ascending phase, a broad peak near k = 2d−2, and a long linear decay during the tail phase. When rescaled as mk/d2 versus k/kmax, the curves collapse onto a universal profile that converges toward a limiting shape as d → ∞, with the rescaled peak approachi… view at source ↗

**Figure 3.** Figure 3: FIG. 3. Median solver runtime versus the bit-length [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: FIG. 4. Schematic of the column contraction for [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗

read the original abstract

We present a family of planted-solution benchmark instances for satisfiability (SAT) solvers and Ising optimization derived from integer factorization. Given two primes $p$ and $q$, the construction encodes the arithmetic constraints of $N = p \times q$ as a conjunctive normal form (CNF) formula whose satisfying assignments correspond to valid factorizations of~$N$. The known pair $(p,q)$ serves as a built-in ground truth, enabling unambiguous verification of solver output. We show that for two $d$-bit primes the total number of carry contractions is on the order of $d^4$. Empirical benchmarks with SAT solvers show that median runtime grows exponentially in the bit-length of the factors over the range tested. The construction provides a scalable, structured, and verifiable benchmark family controlled by a single parameter, accompanied by open-source generation software.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

read the letter

This paper gives a generator for planted SAT and Ising instances from factoring two primes, with size scaling as O(d^4) and exponential solver times, but the encoding's exactness is asserted without proof or checks. The construction encodes N = p × q as a CNF whose solutions are the factor pairs, using carry contractions for the arithmetic. They report the carry count grows as O(d^4) for d-bit primes and show median SAT runtimes rising exponentially with d in their tests. Open-source code is included so others can produce the instances. What is new is the direct use of integer factorization to create these planted benchmarks rather than abstract or random formulas. It does well by making verification trivial—you know the primes in advance—and by tying difficulty to a single, meaningful parameter. The Ising version extends the reach to that community. The main soft spot is the lack of evidence that the CNF is sound and complete. The abstract states that satisfying assignments correspond to valid factorizations, but there is no clause derivation, no small-case enumeration, and no argument showing that auxiliary carry variables do not admit extra solutions. If that fails, the planted property collapses. The runtime data is also tied to the specific solvers used, so it may not generalize. This is for people who test SAT or Ising solvers and want structured, verifiable instances instead of random ones. A reader running systematic benchmarks on number-theoretic problems would get practical value from the generator. It deserves a serious referee because the construction is concrete and the scaling claim is testable. I recommend sending it for peer review, with the clear expectation that reviewers will require more on the encoding details.

Referee Report

2 major / 2 minor

Summary. The paper presents a family of planted-solution benchmark instances for SAT solvers and Ising optimization derived from integer factorization. Given two primes p and q, it encodes the arithmetic constraints of N = p × q as a CNF formula such that satisfying assignments correspond to valid factorizations of N, with the known (p, q) serving as ground truth. For d-bit primes the construction involves O(d^4) carry contractions; empirical benchmarks indicate that median runtime grows exponentially with the bit length of the factors. An open-source generator is provided.

Significance. If the encoding is both sound and complete, the construction supplies a controllable, structured benchmark family with unambiguous planted solutions and verifiable ground truth. This is useful for systematic evaluation of solvers on arithmetic problems whose difficulty can be tuned by a single parameter (bit length), and the reported scaling behavior plus open-source code would strengthen reproducibility.

major comments (2)

[Abstract (construction description)] The central claim that satisfying assignments of the CNF correspond exactly to valid factor pairs (i.e., the encoding is sound and complete) is asserted in the abstract but is not supported by a formal proof, an explicit clause-by-clause derivation of the carry and partial-product constraints, or exhaustive enumeration on small instances. Without this verification the planted-solution property cannot be taken as established.
[Abstract (carry contractions)] The reported O(d^4) scaling of carry contractions is stated without an accompanying derivation or explicit counting argument that would allow a reader to reproduce the bound from the multiplication circuit; this count is load-bearing for claims about instance size and complexity.

minor comments (2)

[Abstract (empirical benchmarks)] The abstract mentions 'empirical benchmarks with SAT solvers' but does not name the solvers, the range of d tested, or the number of instances per size; adding these details would improve reproducibility.
[Abstract] The term 'carry contractions' is introduced without a brief definition or reference to the underlying multiplication circuit; a short explanatory sentence would aid readers unfamiliar with the encoding technique.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate the changes we will make in revision.

read point-by-point responses

Referee: [Abstract (construction description)] The central claim that satisfying assignments of the CNF correspond exactly to valid factor pairs (i.e., the encoding is sound and complete) is asserted in the abstract but is not supported by a formal proof, an explicit clause-by-clause derivation of the carry and partial-product constraints, or exhaustive enumeration on small instances. Without this verification the planted-solution property cannot be taken as established.

Authors: We agree that an explicit verification strengthens the central claim. The submitted manuscript describes the encoding via standard multiplication-circuit constraints but does not include a formal soundness argument or small-instance checks. In revision we will add a dedicated subsection that (i) derives each clause family from the partial-product and carry-save adder steps and (ii) reports exhaustive enumeration for all 4-bit and 6-bit prime pairs, confirming that only the true factor pairs satisfy the CNF. This will rigorously establish the planted-solution property. revision: yes
Referee: [Abstract (carry contractions)] The reported O(d^4) scaling of carry contractions is stated without an accompanying derivation or explicit counting argument that would allow a reader to reproduce the bound from the multiplication circuit; this count is load-bearing for claims about instance size and complexity.

Authors: The quartic scaling follows from the schoolbook multiplication circuit: Θ(d²) bit-wise products generate Θ(d) columns, each requiring a linear number of carry propagations when encoded as CNF clauses. We acknowledge that the submitted version states the O(d^4) bound without an explicit counting derivation. In revision we will insert a short appendix that counts the carry variables and clauses column-by-column, allowing direct reproduction of the asymptotic bound from the circuit description. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation uses external arithmetic inputs

full rationale

The paper constructs CNF formulas by directly encoding the multiplication arithmetic of N = p × q (with carry variables and contractions) for externally chosen primes p and q. The planted solution is supplied by the input primes rather than fitted or defined from solver behavior or the formula itself. No self-citations, ansatzes, or renamings reduce the central claim to its own outputs; the exponential runtime scaling is reported as an empirical observation on the generated instances, not a prediction forced by construction. The encoding's claimed soundness is an assertion about the arithmetic translation, not a definitional loop.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The construction rests on the standard assumption that multiplication can be encoded as CNF clauses without loss of correctness. No free parameters, new physical entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)

domain assumption Arithmetic multiplication and carry propagation can be faithfully represented by a finite set of CNF clauses
Invoked when the paper states that satisfying assignments correspond to valid factorizations.

pith-pipeline@v0.9.0 · 5431 in / 1344 out tokens · 40025 ms · 2026-05-10T17:42:06.144460+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

23 extracted references · 3 canonical work pages · 1 internal anchor

[1]

Achlioptas and Y

D. Achlioptas and Y. Peres, The threshold for random k-sat is 2 k ln 2−o(k), Journal of the American Mathe- matical Society17, 947 (2004), arXiv:cs/0305009

work page internal anchor Pith review arXiv 2004
[2]

Feldman, W

V. Feldman, W. Perkins, and S. Vempala, On the com- plexity of random satisfiability problems with planted so- lutions, SIAM Journal on Computing47, 1294 (2018), arXiv:1311.4821

work page arXiv 2018
[3]

Krzakala, M

F. Krzakala, M. M´ ezard, and L. Zdeborov´ a, Reweighted belief propagation and quiet planting for randomk-sat, Journal on Satisfiability, Boolean Modeling and Compu- tation8, 149 (2012), arXiv:1203.5521

work page arXiv 2012
[4]

Balint, A

A. Balint, A. Belov, M. J¨ arvisalo, and C. Sinz, Overview and analysis of the sat challenge 2012 solver competition, Artificial Intelligence223, 120 (2015)

2012
[5]

I. Hen, J. Job, T. Albash, T. F. Rønnow, M. Troyer, and D. A. Lidar, Probing for quantum speedup in spin-glass problems with planted solutions, Physical Review A92, 042325 (2015)

2015
[6]

Hen, Equation planting: a tool for benchmarking Ising machines, Physical Review Applied12, 011003 (2019)

I. Hen, Equation planting: a tool for benchmarking Ising machines, Physical Review Applied12, 011003 (2019)

2019
[7]

Kowalsky, T

M. Kowalsky, T. Albash, I. Hen, and D. A. Lidar, 3- regular 3-XORSAT planted solutions benchmark of clas- sical and quantum heuristic optimizers, Quantum Science and Technology7, 025008 (2022)

2022
[8]

H. Jia, C. Moore, and D. Strain, Generating hard satisfi- able formulas by hiding solutions deceptively, Journal of Artificial Intelligence Research24, 537 (2005)

2005
[9]

Achlioptas, H

D. Achlioptas, H. Jia, and C. Moore, Hiding satisfying 8 assignments: two are better than one, Journal of Artifi- cial Intelligence Research24, 623 (2005)

2005
[10]

Purdom and A

P. Purdom and A. Sabry, CNF generator for fac- toring problems,https://homes.luddy.indiana.edu/ sabry/cnf.html(2003), software, Indiana University

2003
[11]

S. A. Cook and D. G. Mitchell, Finding hard instances of the satisfiability problem: a survey, inSatisfiability Prob- lem: Theory and Applications, DIMACS Series in Dis- crete Mathematics and Theoretical Computer Science, Vol. 35 (American Mathematical Society, 1997) pp. 1– 17

1997
[12]

Massacci and L

F. Massacci and L. Marraro, Logical cryptanalysis as a SAT problem, Journal of Automated Reasoning24, 165 (2000)

2000
[13]

Biere and M

A. Biere and M. Fleury, Gimsatul, Isasat and Kissat entering the SAT competition 2022, inProceedings of SAT Competition 2022(2022) pp. 10–11, available at https://satcompetition.github.io/2022/

2022
[14]

Biere, K

A. Biere, K. Fazekas, M. Fleury, and M. Heisinger, CaD- iCaL, Kissat, Paracooba, Plingeling and Treengeling en- tering the SAT competition 2020, inProceedings of SAT Competition 2020(2020) pp. 51–53, available athttps: //satcompetition.github.io/2020/

2020
[15]

Hen, Factorization SAT benchmark generator (2025), software available athttps://github.com/itay-hen/ pq-SAT-benchmark

I. Hen, Factorization SAT benchmark generator (2025), software available athttps://github.com/itay-hen/ pq-SAT-benchmark. Appendix A: W orked Example:p= 11,q= 13 This appendix walks through the full pipeline forN= 143 = 11×13. Because both factors are 4-bit inte- gers, this is the smallest instance of the symmetric case np =n q =dwithd= 4, making the exam...

2025
[16]

Heren p =n q =d= 4

Binary representations The two primes and their product in least-significant- bit-first (LSB-first) binary are p= 11 = (1 1 0 1) 2, p 0 = 1, p 1 = 1, p2 = 0, p 3 = 1, q= 13 = (1 0 1 1) 2, q 0 = 1, q 1 = 0, q2 = 1, q 3 = 1,(A1) N= 143 = (1 1 1 1 0 0 0 1) 2, N 0 = 1, N 1 = 1, N 2 = 1, N3 = 1, N4 = 0, N5 = 0, N 6 = 0, N 7 = 1. Heren p =n q =d= 4
[17]

The column assignment is shown in Table III

Partial products The 4×4 = 16 partial productsa ij =p i ∧q j form the multiplication table q0=1q 1=0q 2=1q 3=1 p0=1 a00=1a 01=0a 02=1a 03=1 p1=1 a10=1a 11=0a 12=1a 13=1 p2=0 a20=0a 21=0a 22=0a 23=0 p3=1 a30=1a 31=0a 32=1a 33=1 (A2) where eacha ij is placed in columnk=i+jof the addi- tion table. The column assignment is shown in Table III. TABLE III. Parti...
[18]

Column contraction The contractions proceed left-to-right. At columnk, the column contains its pp k partial products plus carries arriving from columnk−1; if it holdsm k ≥2 entries, mk −1 half-adder contractions reduce it to a single entry, each generating one XOR clause (sum) and one AND clause (carry), with the carry promoted to columnk+1. Table IV list...
[19]

These comprise the 2d= 8 input bits (p0,

Pre-reduction instance sizes The pre-reduction system contains: •Boolean variables: 2d+d 2 + 2C(d) = 8 + 16 + 144 = 168. These comprise the 2d= 8 input bits (p0, . . . , p3, q0, . . . , q3),d 2 = 16 partial products, and 2×72 = 144 auxiliary variables (one sum and one carry per contraction). •AND clauses:d 2 +C(d) = 16 + 72 = 88 (16 from partial products,...
[20]

We describe the key deductions grouped by the column structure that generates them

Preprocessing The iterative Boolean reduction proceeds as follows. We describe the key deductions grouped by the column structure that generates them. a. Column 0: extracting the LSBs.The sole entry in column 0 isa 00 =p 0 ∧q 0, pinned toN 0 = 1. Since an AND output of 1 forces both inputs to 1, we immediately deduce p0 = 1, q 0 = 1. This collapses all pa...
[21]

(5)], and the 12 residual XOR clauses each produce 4 CNF clauses [Eq

CNF output The 20 residual AND clauses each produce 3 CNF clauses [Eq. (5)], and the 12 residual XOR clauses each produce 4 CNF clauses [Eq. (6)], giving a raw count of 3×20 + 4×12 = 108 clauses. After removing tautolo- gies and subsumed clauses during the CNF cleanup pass, 102 clauses survive on 29 variables
[22]

Each AND gadget [Eq

Ising compilation The Ising model is assembled from the 20 residual AND clauses and 12 residual XOR clauses. Each AND gadget [Eq. (20)] contributes 3 couplings with no auxiliary spins; each XOR gadget [Eq. (22)] contributes 6 couplings and introduces one auxiliary spin. The resulting Hamiltonian has 29 + 12 = 41 spins (29 physical, 12 XOR auxiliaries) bef...
[23]

11 TABLE V

Summary of instance dimensions Table V collects all key quantities for this example, comparing the pre-reduction formula predictions with the actual values. 11 TABLE V. Instance dimensions forN= 143 = 11×13 (d= 4). Pre-reduction values match the exact formulas of Table I. Quantity Pre-reduction Post-reduction Boolean variables 168 29 (free) Variables pinn...