Automated Reencoding Meets Graph Theory

Benjamin Przybocki; Bernardo Subercaseaux; Marijn J. H. Heule

arxiv: 2603.27774 · v2 · submitted 2026-03-29 · 💻 cs.CC · cs.LO

Automated Reencoding Meets Graph Theory

Benjamin Przybocki , Bernardo Subercaseaux , Marijn J. H. Heule This is my paper

Pith reviewed 2026-05-14 21:37 UTC · model grok-4.3

classification 💻 cs.CC cs.LO

keywords Bounded Variable Addition2-CNF formulasSAT preprocessinggraph-theoretic characterizationreencodingat-most-one constraintsclause minimization

0 comments

The pith

Idealized BVA reencodes any 2-CNF formula into one with (lg(3)/4 + o(1)) n² / lg n clauses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gives a graph-theoretic model that exactly describes which 2-CNF encodings an idealized Bounded Variable Addition procedure can build. From this model the authors derive that idealized BVA plus minor steps such as equivalent literal substitution can rewrite any 2-CNF on n variables as an equivalent 2-CNF whose clause count is asymptotically (lg(3)/4) n² over lg n. The result matters because every clause removed from a SAT instance directly reduces the work done by modern solvers on combinatorial search problems. The same model also yields tight lower bounds for the at-most-one constraint and shows how the graph view produces a faster practical BVA implementation.

Core claim

The encodings that idealized BVA can produce correspond to specific graphs; this correspondence lets the authors prove that any 2-CNF can be reencoded to (lg(3)/4 + o(1)) n² / lg n clauses when equivalent literal substitution is allowed, that the constant rises to 1 without that step, that no reencoding method can beat 0.25, and that the at-most-one constraint on n variables requires at least 3n-6 clauses under BVA, a bound achieved by real implementations but not by the product encoding.

What carries the argument

Graph-theoretic characterization of the 2-CNF encodings producible by idealized BVA, which models clause sets as graphs whose structure determines what BVA can construct.

If this is right

Any 2-CNF on n variables can be rewritten as an equivalent 2-CNF whose size is (lg(3)/4 + o(1)) n² / lg n.
Without equivalent literal substitution the best constant BVA achieves is 1 rather than lg(3)/4.
No reencoding procedure, BVA or otherwise, can reach a constant factor below 0.25 for general 2-CNF formulas.
The at-most-one constraint on n variables requires at least 3n-6 clauses under idealized BVA, and this bound is tight for actual implementations.
The product encoding of at-most-one, which uses 2n + o(n) clauses, lies outside the set of encodings BVA can produce.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The graph view of BVA may be extended to characterize other preprocessing routines such as variable elimination or clause subsumption.
The faster implementation derived from the graph model could be dropped into existing SAT solvers to improve preprocessing speed on large instances.
Similar graph reductions might be discovered for non-2-CNF constraints or for encodings that mix BVA with other techniques.
The 0.25 lower bound suggests a fundamental limit on how much any local reencoding method can compress 2-CNF formulas.

Load-bearing premise

The idealized BVA algorithm produces exactly the encodings predicted by the graph model; real heuristics could deviate and invalidate the bounds.

What would settle it

Take a large random monotone 2-CNF on n variables, apply idealized BVA plus equivalent literal substitution, and count the resulting clauses; if the count stays above (lg(3)/4 + o(1)) n² / lg n for arbitrarily large n, the reencoding claim is false.

Figures

Figures reproduced from arXiv: 2603.27774 by Benjamin Przybocki, Bernardo Subercaseaux, Marijn J. H. Heule.

**Figure 1.** Figure 1: Illustration of Example 1 More generally, let C and D be sets of clauses. Then, using notation C ▷◁ D to denote the set of clauses of the form (Ci ∨ Dj ) for Ci ∈ C and Dj ∈ D, we define a BVA step as follows: Definition 2 (BVA Step). Let φ be a formula, and let C and D be sets of nonempty clauses such that C ▷◁ D ⊆ φ, and y ̸∈ Var(φ) be a fresh variable. Then, we say that the formula φ ′ := (φ \ (C ▷◁ D))… view at source ↗

**Figure 2.** Figure 2: Illustration of a diagram and a polarized diagram for a simple 2-CNF. Edges with [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Illustration of the BVA steps in Example 2 Remark 1. Let φ be a simple 2-CNF formula, and let (G, p) be a polarized diagram of Gφ. Then, φ = FR(G,p) . It turns out that polarized rectifier networks represent a more general class of simple 2-CNF encodings than what can be achieved by idealized BVA. To that end, we define the following: Definition 9 (Strict Polarized Rectifier Network). Let R = (B, A, E, p) … view at source ↗

**Figure 4.** Figure 4: SPRN realizing Kn with 3n − 6 edges Using Theorem 5, we prove that 3n−6 clauses is the smallest possible encoding for this constraint achievable by idealized BVA: Theorem 8 (Formal version of Theorem 4). For every φ with AtMostOne(x1, . . . , xn)∼∼▷ BVA ∗φ, we have |φ| ≥ 3n − 6. Note that there are 2-CNF encodings for AtMostOne(x1, . . . , xn) using 2n + o(n) clauses [7], and it was previously unknown whet… view at source ↗

**Figure 5.** Figure 5: Illustration for the proof of Theorem 8 Lemma 7. Let R be an SPRN realizing Kn with m edges for some n ≥ 4, and let x1 and x2 be base vertices. If the total degrees of x1 and x2 in R are both 2 and there is an edge between x1 and x2, then there is an SPRN realizing Kn−1 with at most m − 3 edges. Proof. Let R′ be obtained from R by deleting x1 and its two incident edges; also if R′ contains any auxiliary ve… view at source ↗

**Figure 6.** Figure 6: Comparison of reencoding methods on random monotone formulas [PITH_FULL_IMAGE:figures/full_fig_p015_6.png] view at source ↗

**Figure 7.** Figure 7: End-to-end runtime comparison of reencoding methods on random monotone formulas [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗

read the original abstract

Bounded Variable Addition (BVA) is a central preprocessing method in modern state-of-the-art SAT solvers. We provide a graph-theoretic characterization of which 2-CNF encodings can be constructed by an idealized BVA algorithm. Based on this insight, we prove new results about the behavior and limitations of BVA and its interaction with other preprocessing techniques. We show that idealized BVA, plus some minor additional preprocessing (e.g., equivalent literal substitution), can reencode any 2-CNF formula with $n$ variables into an equivalent 2-CNF formula with $(\tfrac{\lg(3)}{4}+o(1))\,\tfrac{n^2}{\lg n}$ clauses. Furthermore, we show that without the additional preprocessing the constant factor worsens from $\tfrac{\lg(3)}{4} \approx 0.396$ to $1$, and that no reencoding method can achieve a constant below $0.25$. On the other hand, for the at-most-one constraint on $n$ variables, we prove that idealized BVA cannot reencode this constraint using fewer than $3n-6$ clauses, a bound that we prove is achieved by actual implementations. In particular, this shows that the product encoding for at-most-one, which uses $2n+o(n)$ clauses, cannot be constructed by BVA regardless of the heuristics used. Finally, our graph-theoretic characterization of BVA allows us to leverage recent work in algorithmic graph theory to develop a drastically more efficient implementation of BVA that achieves a comparable clause reduction on random monotone 2-CNF formulas.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper models BVA via graphs to get tight reencoding bounds for 2-CNFs and a faster implementation.

read the letter

The punchline is that idealized BVA on 2-CNF formulas can be captured exactly by a graph model, which yields new upper and lower bounds on the size of the output encoding plus a practical speed-up. They show that with a bit of extra preprocessing like equivalent literal substitution you reach (lg(3)/4 + o(1)) n² / lg n clauses, that nothing can beat a 0.25 constant factor, and that at-most-one constraints bottom out at 3n-6 clauses, a bound real BVA implementations already hit. The product encoding is ruled out for BVA regardless of heuristics. The characterization itself looks absent from the prior work they cite, and they turn the graph view into a faster algorithm that still delivers comparable clause reduction on random monotone 2-CNFs. The bounds come from the external graph model rather than parameter fitting, which keeps them non-circular. The 3n-6 tightness check against actual code is a solid concrete anchor. The main soft spot is that everything is stated for the idealized algorithm, so deployed heuristics could deviate in small ways that affect the constants in practice, though the authors flag this. The full proofs would need a close read for any hidden assumptions in the graph construction, but the abstract states the theorems with explicit constants and the stress-test found no obvious gaps. This is for people who build or analyze SAT preprocessors and encodings. A solver engineer or encoding researcher would get direct value from the bounds and the implementation trick. It deserves a serious referee because the new characterization and the concrete, checkable results on a core technique are worth referee time even if some polishing is needed.

Referee Report

0 major / 2 minor

Summary. The manuscript provides a graph-theoretic characterization of the 2-CNF encodings producible by an idealized Bounded Variable Addition (BVA) algorithm. Using this model, the authors prove that idealized BVA combined with minor additional preprocessing (such as equivalent literal substitution) can reencode any 2-CNF formula on n variables into an equivalent 2-CNF with at most (lg(3)/4 + o(1)) n² / lg n clauses. They establish that the constant worsens to 1 without the extra preprocessing, that no reencoding technique can achieve a constant below 0.25, and that idealized BVA requires at least 3n-6 clauses for the at-most-one constraint on n variables—a bound shown to be tight for actual BVA implementations, implying the product encoding (2n + o(n) clauses) cannot be obtained via BVA. The characterization is further used to derive a more efficient BVA implementation that performs well on random monotone 2-CNF formulas.

Significance. If the central claims hold, the work makes a substantive contribution by connecting SAT preprocessing to algorithmic graph theory, yielding explicit constants, matching upper and lower bounds, and a concrete tight result (3n-6) for at-most-one that is verified on deployed BVA. The development of a drastically faster implementation based on the graph model is a clear practical strength, as is the absence of free parameters or fitted quantities in the derivations. These results clarify the limitations of BVA and should inform both solver design and future theoretical analyses of preprocessing.

minor comments (2)

[Abstract] In the abstract and throughout, explicitly state that lg denotes the base-2 logarithm to avoid ambiguity for readers outside the SAT community.
[Implementation] The description of the new graph-theoretic BVA implementation would benefit from a short pseudocode snippet or high-level algorithm outline to make the efficiency improvement more transparent.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, the recognition of the graph-theoretic characterization of BVA, the explicit constants and matching bounds, the tight 3n-6 result for at-most-one, and the practical faster implementation. We appreciate the recommendation to accept.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper provides an explicit graph-theoretic characterization of encodings producible by idealized BVA, then derives the (lg(3)/4 + o(1)) n² / lg n upper bound, the worsened constant without extra preprocessing, the 0.25 lower bound, and the tight 3n-6 at-most-one result directly from that model plus standard graph-theoretic arguments. These steps are self-contained against external benchmarks and do not reduce to fitted parameters, self-definitions, or load-bearing self-citations. The concrete tightness result for actual BVA implementations is shown by matching the model to deployed behavior rather than by renaming or circular prediction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on a graph model of BVA that is introduced in the paper; the model itself is treated as an axiom for the subsequent proofs. No free parameters or invented physical entities are visible in the abstract.

axioms (1)

domain assumption The idealized BVA algorithm produces exactly the 2-CNF formulas whose incidence structure satisfies the stated graph-theoretic property.
This modeling assumption is the foundation for all subsequent bounds; it is introduced to enable the graph-theoretic analysis.

pith-pipeline@v0.9.0 · 5593 in / 1362 out tokens · 40153 ms · 2026-05-14T21:37:42.933873+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean; IndisputableMonolith/Foundation/AlexanderDuality.lean reality_from_one_distinction; alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We provide a graph-theoretic characterization of which 2-CNF encodings can be constructed by an idealized BVA algorithm... correspondence between reencodings achievable by BVA and what we call strict polarized rectifier networks (Theorem 5).
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel; J_uniquely_calibrated_via_higher_derivative unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

idealized BVA... can reencode any 2-CNF formula with n variables into an equivalent 2-CNF formula with (lg(3)/4 + o(1)) n² / lg n clauses

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

[1]

Ab´ ıo, R

I. Ab´ ıo, R. Nieuwenhuis, A. Oliveras, and E. Rodr´ ıguez-Carbonell. A Parametric Approach for Smaller and Better Encodings of Cardinality Constraints. In C. Schulte, editor,Principles and Practice of Constraint Programming, pages 80–96, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg. 4 15

work page 2013
[2]

P. Allen. Almost every 2-SAT function is unate.Israel J. Math., 161:311–346, 2007. doi: 10.1007/s11856-007-0081-z. 6, 11

work page doi:10.1007/s11856-007-0081-z 2007
[3]

Balogh, D

J. Balogh, D. Dong, B. Lidick´ y, N. Mani, and Y. Zhao. Nearly all k-SAT functions are unate. InProceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, page 958–962, New York, NY, USA, 2023. Association for Computing Machinery. doi: 10.1145/3564246.3585123. 14

work page doi:10.1145/3564246.3585123 2023
[4]

Balyo, M

T. Balyo, M. Heule, M. Iser, M. J¨ arvisalo, and M. Suda, editors.Proceedings of SAT Competition 2023: Solver, Benchmark and Proof Checker Descriptions. Department of Computer Science Series of Publications B. Department of Computer Science, University of Helsinki, Finland,

work page 2023
[5]

A. Biere. exactly one clauses·Issue #39·arminbiere/kissat — github.com. https:// github.com/arminbiere/kissat/issues/39#issuecomment-1686043817, 2023. [Accessed 06-06-2025]. 1

work page 2023
[6]

A. Biere. cadical/src/factor.cpp at master·arminbiere/cadical — github.com. https://github. com/arminbiere/cadical/blob/master/src/factor.cpp, 2026. [Accessed 06-02-2026]. 1

work page 2026
[7]

J. Chen. A new SAT encoding of the at-most-one constraint.Proc. of the Tenth Int. Workshop of Constraint Modelling and Reformulation., page 8, 2010. 3, 12

work page 2010
[8]

Chistikov, S

D. Chistikov, S. Iv´ an, A. Lubiw, and J. Shallit. Fractional Coverings, Greedy Coverings, and Rectifier Networks. In H. Vollmer and B. Vall´ ee, editors,34th Symposium on Theoretical Aspects of Computer Science (STACS 2017), volume 66 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 23:1–23:14, Dagstuhl, Germany, 2017. Schloss Dagstuhl ...

work page doi:10.4230/lipics.stacs.2017.23 2017
[9]

Ganesh and M

V. Ganesh and M. Y. Vardi.On the Unreasonable Effectiveness of SAT Solvers, page 547–566. Cambridge University Press, 2021. 1

work page 2021
[10]

Haberlandt, H

A. Haberlandt, H. Green, and M. J. H. Heule. Effective Auxiliary Variables via Structured Reencoding. In M. Mahajan and F. Slivovsky, editors,26th International Conference on Theory and Applications of Satisfiability Testing (SAT 2023), volume 271 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 11:1–11:19, Dagstuhl, Germany, 2023. Schlo...

work page doi:10.4230/lipics.sat.2023.11 2023
[11]

M. J. H. Heule, M. J¨ arvisalo, and A. Biere. Efficient CNF Simplification Based on Binary Impli- cation Graphs. In K. A. Sakallah and L. Simon, editors,Theory and Applications of Satisfiability Testing - SAT 2011, pages 201–215. Springer, 2011. doi:10.1007/978-3-642-21581-0_17 . 2

work page doi:10.1007/978-3-642-21581-0_17 2011
[12]

M. J. H. Heule and M. Scheucher. Happy Ending: An Empty Hexagon in Every Set of 30 Points. In B. Finkbeiner and L. Kov´ acs, editors,Tools and Algorithms for the Construction and Analysis of Systems, pages 61–80, Cham, 2024. Springer Nature Switzerland. 4

work page 2024
[13]

PySAT: A Python toolkit for prototyping with SAT oracles

A. Ignatiev, A. Morgado, and J. Marques-Silva. PySAT: A Python toolkit for prototyping with SAT oracles. InSAT, pages 428–437, 2018.doi:10.1007/978-3-319-94144-8_26. 28

work page doi:10.1007/978-3-319-94144-8_26 2018
[14]

Iv´ an,´A

S. Iv´ an,´A. D. Lelkes, J. Nagy-Gy¨ orgy, B. Sz¨ or´ enyi, and G. Tur´ an. Biclique Coverings, Rectifier Networks and the Cost of ε-Removal. In H. J¨ urgensen, J. Karhum¨ aki, and A. Okhotin, editors, 16 Descriptional Complexity of Formal Systems, pages 174–185, Cham, 2014. Springer International Publishing. 2, 5

work page 2014
[15]

S. Jukna. Computational Complexity of Graphs. InAdvances in Network Complexity, chapter 5, pages 99–153. John Wiley & Sons, Ltd, 2013.doi:10.1002/9783527670468.ch05. 5

work page doi:10.1002/9783527670468.ch05 2013
[16]

Jukna and I

S. Jukna and I. Sergeev. Complexity of linear boolean operators.Found. Trends Theor. Comput. Sci., 9(1):1–123, 2013.doi:10.1561/0400000063. 8

work page doi:10.1561/0400000063 2013
[17]

Kirchweger, T

M. Kirchweger, T. Peitl, B. Subercaseaux, and S. Szeider. From the finite to the infinite: Sharper asymptotic bounds on Norin’s conjecture via SAT, 2025. URL: https://arxiv.org/ abs/2511.08386,arXiv:2511.08386. 4

work page arXiv 2025
[18]

Krapivin, B

A. Krapivin, B. Przybocki, N. Sanhueza-Matamala, and B. Subercaseaux. Optimal and efficient partite decompositions of hypergraphs, 2025. URL: https://arxiv.org/abs/2511.11855, arXiv:2511.11855. 3, 9, 10, 14

work page arXiv 2025
[19]

O. Lupanov. On rectifier and switching-and-rectifier circuits.Doklady Academii nauk SSSR, 111, 1956. Available at https://web.vu.lt/mif/s.jukna/boolean/lupanov56.pdf, thanks to Stasys Jukna. 2, 5

work page 1956
[20]

Manthey, M

N. Manthey, M. J. H. Heule, and A. Biere. Automated reencoding of boolean formulas. In Haifa Verification Conference, pages 102–117. Springer, 2012. 1, 2, 3, 4, 5, 12, 14, 27

work page 2012
[21]

E. I. Nechiporuk. The topological principles of self-correction.Problemy Kibernet., (21):5– 102, 1969. Available in Russian at https://web.vu.lt/mif/s.jukna/boolean/Russians/ Nechiporuk-1969a.pdf, thanks to Stasys Jukna. 2, 9

work page 1969
[22]

L. Qian, E. Wang, B. Subercaseaux, and M. J. H. Heule. Unfolding boxes with local constraints. InAutomated Deduction – CADE 30: 30th International Conference on Automated Deduction, Stuttgart, Germany, July 28-31, 2025, Proceedings, page 736–754, Berlin, Heidelberg, 2025. Springer-Verlag.doi:10.1007/978-3-031-99984-0_38. 4

work page doi:10.1007/978-3-031-99984-0_38 2025
[23]

2024.arXiv:10138/584822

Proceedings of SAT Competition 2024 : Solver, Benchmark and Proof Checker Descriptions. 2024.arXiv:10138/584822. 1

work page 2024
[24]

Subercaseaux

B. Subercaseaux. Asymptotically smaller encodings for graph problems and scheduling, 2025. (ModRef workshop). URL:https://arxiv.org/abs/2506.14042,arXiv:2506.14042. 11

work page arXiv 2025
[25]

Subercaseaux and M

B. Subercaseaux and M. J. H. Heule. The Packing Chromatic Number of the Infinite Square Grid is 15. In S. Sankaranarayanan and N. Sharygina, editors,Tools and Algorithms for the Construction and Analysis of Systems - 29th International Conference, TACAS 2023, volume 13993 ofLecture Notes in Computer Science, pages 389–406. Springer, 2023. doi: 10.1007/978...

work page doi:10.1007/978-3-031-30823-9_20 2023
[26]

R. E. Tarjan. Depth-first search and linear graph algorithms.SIAM J. Comput., 1(2):146–160, 1972.doi:10.1137/0201010. 18 A Proofs from Section 3 Lemma 8.Given a satisfiable 2-CNF formula φ, we can compute a formula φ′ in linear time such that (a) φ′ is obtained from φ by replacing some variables by their negations wherever they appear, and (b) every claus...

work page doi:10.1137/0201010 1972
[27]

for everyx∈Xandy∈Y, we have(x, y)∈E; and

work page
[28]

Definition 11(Biclique Reduction).Let R = (B, A, E, p)be a PRN and( X, Y )a coherent biclique with edge set EX,Y

for everyy∈Y∩Bandx 1, x2 ∈X, we havep(x 1, y) =p(x 2, y). Definition 11(Biclique Reduction).Let R = (B, A, E, p)be a PRN and( X, Y )a coherent biclique with edge set EX,Y . Then, thebiclique reductionof R with respect to( X, Y )is a new PRN R′ = (B, A′, E′, p′), whereA ′ =A⊔ {z}for some new vertexzand E′ := (E\E X,Y )∪ {(x, z)|x∈X} ∪ {(z, y)|y∈Y}, p′(u, v...

work page
[29]

add the clause ( xi ∨ xj) toφ

work page
[30]

add the clause ( xi ∨x j) toφ; or

work page
[31]

There are 3(n

add neither clause toφ. There are 3(n

work page
[32]

Proposition 3.There is a monotone 2-CNF formula with n variables whose smallest 2-CNF encoding has at least( 1 4 −o(1)) n2 lgn clauses

= 3(1/2−o(1))n2 ways to carry out this construction, each of which yields a distinct simple 2-CNF formula onnvariables. Proposition 3.There is a monotone 2-CNF formula with n variables whose smallest 2-CNF encoding has at least( 1 4 −o(1)) n2 lgn clauses. Proof. Let g(m) be the number of 2-CNF formulas with at most m clauses, and let h(n) := 2(n

work page
[33]

By Lemma 2, lgg (m) ≤ (1 +o(1))m·lgm

be the number of monotone 2-CNF functions on n variables. By Lemma 2, lgg (m) ≤ (1 +o(1))m·lgm . If every monotone 2-CNF function on n variables can be encoded with at most m clauses, then g(m) ≥h (n), and consequently lgg (m) ≥lgh (n). If m≤ (r + o(1))n2/lgn for some constant r, then lgm≤(2 +o(1)) lgn, and thus (1/2−o(1))n 2 ≤lgh(n)≤lgg(m)≤(1 +o(1))m·lgm...

work page
[34]

Since randomness is involved, all our results, both for clauses, variables, and runtimes, are averaged over 5 independent trials

is well-concentrated around 2 lgn , we consider both a satisfiable regime, for which we set k := 1 + ⌊1.2 lgn⌋ , as well as an unsatisfiable regime for which we set k := ⌊30 lgn⌋ in order to keep the runtimes manageable (as k approaches the critical threshold, the instances become harder to solve). Since randomness is involved, all our results, both for c...

work page 1901

[1] [1]

Ab´ ıo, R

I. Ab´ ıo, R. Nieuwenhuis, A. Oliveras, and E. Rodr´ ıguez-Carbonell. A Parametric Approach for Smaller and Better Encodings of Cardinality Constraints. In C. Schulte, editor,Principles and Practice of Constraint Programming, pages 80–96, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg. 4 15

work page 2013

[2] [2]

P. Allen. Almost every 2-SAT function is unate.Israel J. Math., 161:311–346, 2007. doi: 10.1007/s11856-007-0081-z. 6, 11

work page doi:10.1007/s11856-007-0081-z 2007

[3] [3]

Balogh, D

J. Balogh, D. Dong, B. Lidick´ y, N. Mani, and Y. Zhao. Nearly all k-SAT functions are unate. InProceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, page 958–962, New York, NY, USA, 2023. Association for Computing Machinery. doi: 10.1145/3564246.3585123. 14

work page doi:10.1145/3564246.3585123 2023

[4] [4]

Balyo, M

T. Balyo, M. Heule, M. Iser, M. J¨ arvisalo, and M. Suda, editors.Proceedings of SAT Competition 2023: Solver, Benchmark and Proof Checker Descriptions. Department of Computer Science Series of Publications B. Department of Computer Science, University of Helsinki, Finland,

work page 2023

[5] [5]

A. Biere. exactly one clauses·Issue #39·arminbiere/kissat — github.com. https:// github.com/arminbiere/kissat/issues/39#issuecomment-1686043817, 2023. [Accessed 06-06-2025]. 1

work page 2023

[6] [6]

A. Biere. cadical/src/factor.cpp at master·arminbiere/cadical — github.com. https://github. com/arminbiere/cadical/blob/master/src/factor.cpp, 2026. [Accessed 06-02-2026]. 1

work page 2026

[7] [7]

J. Chen. A new SAT encoding of the at-most-one constraint.Proc. of the Tenth Int. Workshop of Constraint Modelling and Reformulation., page 8, 2010. 3, 12

work page 2010

[8] [8]

Chistikov, S

D. Chistikov, S. Iv´ an, A. Lubiw, and J. Shallit. Fractional Coverings, Greedy Coverings, and Rectifier Networks. In H. Vollmer and B. Vall´ ee, editors,34th Symposium on Theoretical Aspects of Computer Science (STACS 2017), volume 66 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 23:1–23:14, Dagstuhl, Germany, 2017. Schloss Dagstuhl ...

work page doi:10.4230/lipics.stacs.2017.23 2017

[9] [9]

Ganesh and M

V. Ganesh and M. Y. Vardi.On the Unreasonable Effectiveness of SAT Solvers, page 547–566. Cambridge University Press, 2021. 1

work page 2021

[10] [10]

Haberlandt, H

A. Haberlandt, H. Green, and M. J. H. Heule. Effective Auxiliary Variables via Structured Reencoding. In M. Mahajan and F. Slivovsky, editors,26th International Conference on Theory and Applications of Satisfiability Testing (SAT 2023), volume 271 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 11:1–11:19, Dagstuhl, Germany, 2023. Schlo...

work page doi:10.4230/lipics.sat.2023.11 2023

[11] [11]

M. J. H. Heule, M. J¨ arvisalo, and A. Biere. Efficient CNF Simplification Based on Binary Impli- cation Graphs. In K. A. Sakallah and L. Simon, editors,Theory and Applications of Satisfiability Testing - SAT 2011, pages 201–215. Springer, 2011. doi:10.1007/978-3-642-21581-0_17 . 2

work page doi:10.1007/978-3-642-21581-0_17 2011

[12] [12]

M. J. H. Heule and M. Scheucher. Happy Ending: An Empty Hexagon in Every Set of 30 Points. In B. Finkbeiner and L. Kov´ acs, editors,Tools and Algorithms for the Construction and Analysis of Systems, pages 61–80, Cham, 2024. Springer Nature Switzerland. 4

work page 2024

[13] [13]

PySAT: A Python toolkit for prototyping with SAT oracles

A. Ignatiev, A. Morgado, and J. Marques-Silva. PySAT: A Python toolkit for prototyping with SAT oracles. InSAT, pages 428–437, 2018.doi:10.1007/978-3-319-94144-8_26. 28

work page doi:10.1007/978-3-319-94144-8_26 2018

[14] [14]

Iv´ an,´A

S. Iv´ an,´A. D. Lelkes, J. Nagy-Gy¨ orgy, B. Sz¨ or´ enyi, and G. Tur´ an. Biclique Coverings, Rectifier Networks and the Cost of ε-Removal. In H. J¨ urgensen, J. Karhum¨ aki, and A. Okhotin, editors, 16 Descriptional Complexity of Formal Systems, pages 174–185, Cham, 2014. Springer International Publishing. 2, 5

work page 2014

[15] [15]

S. Jukna. Computational Complexity of Graphs. InAdvances in Network Complexity, chapter 5, pages 99–153. John Wiley & Sons, Ltd, 2013.doi:10.1002/9783527670468.ch05. 5

work page doi:10.1002/9783527670468.ch05 2013

[16] [16]

Jukna and I

S. Jukna and I. Sergeev. Complexity of linear boolean operators.Found. Trends Theor. Comput. Sci., 9(1):1–123, 2013.doi:10.1561/0400000063. 8

work page doi:10.1561/0400000063 2013

[17] [17]

Kirchweger, T

M. Kirchweger, T. Peitl, B. Subercaseaux, and S. Szeider. From the finite to the infinite: Sharper asymptotic bounds on Norin’s conjecture via SAT, 2025. URL: https://arxiv.org/ abs/2511.08386,arXiv:2511.08386. 4

work page arXiv 2025

[18] [18]

Krapivin, B

A. Krapivin, B. Przybocki, N. Sanhueza-Matamala, and B. Subercaseaux. Optimal and efficient partite decompositions of hypergraphs, 2025. URL: https://arxiv.org/abs/2511.11855, arXiv:2511.11855. 3, 9, 10, 14

work page arXiv 2025

[19] [19]

O. Lupanov. On rectifier and switching-and-rectifier circuits.Doklady Academii nauk SSSR, 111, 1956. Available at https://web.vu.lt/mif/s.jukna/boolean/lupanov56.pdf, thanks to Stasys Jukna. 2, 5

work page 1956

[20] [20]

Manthey, M

N. Manthey, M. J. H. Heule, and A. Biere. Automated reencoding of boolean formulas. In Haifa Verification Conference, pages 102–117. Springer, 2012. 1, 2, 3, 4, 5, 12, 14, 27

work page 2012

[21] [21]

E. I. Nechiporuk. The topological principles of self-correction.Problemy Kibernet., (21):5– 102, 1969. Available in Russian at https://web.vu.lt/mif/s.jukna/boolean/Russians/ Nechiporuk-1969a.pdf, thanks to Stasys Jukna. 2, 9

work page 1969

[22] [22]

L. Qian, E. Wang, B. Subercaseaux, and M. J. H. Heule. Unfolding boxes with local constraints. InAutomated Deduction – CADE 30: 30th International Conference on Automated Deduction, Stuttgart, Germany, July 28-31, 2025, Proceedings, page 736–754, Berlin, Heidelberg, 2025. Springer-Verlag.doi:10.1007/978-3-031-99984-0_38. 4

work page doi:10.1007/978-3-031-99984-0_38 2025

[23] [23]

2024.arXiv:10138/584822

Proceedings of SAT Competition 2024 : Solver, Benchmark and Proof Checker Descriptions. 2024.arXiv:10138/584822. 1

work page 2024

[24] [24]

Subercaseaux

B. Subercaseaux. Asymptotically smaller encodings for graph problems and scheduling, 2025. (ModRef workshop). URL:https://arxiv.org/abs/2506.14042,arXiv:2506.14042. 11

work page arXiv 2025

[25] [25]

Subercaseaux and M

B. Subercaseaux and M. J. H. Heule. The Packing Chromatic Number of the Infinite Square Grid is 15. In S. Sankaranarayanan and N. Sharygina, editors,Tools and Algorithms for the Construction and Analysis of Systems - 29th International Conference, TACAS 2023, volume 13993 ofLecture Notes in Computer Science, pages 389–406. Springer, 2023. doi: 10.1007/978...

work page doi:10.1007/978-3-031-30823-9_20 2023

[26] [26]

R. E. Tarjan. Depth-first search and linear graph algorithms.SIAM J. Comput., 1(2):146–160, 1972.doi:10.1137/0201010. 18 A Proofs from Section 3 Lemma 8.Given a satisfiable 2-CNF formula φ, we can compute a formula φ′ in linear time such that (a) φ′ is obtained from φ by replacing some variables by their negations wherever they appear, and (b) every claus...

work page doi:10.1137/0201010 1972

[27] [27]

for everyx∈Xandy∈Y, we have(x, y)∈E; and

work page

[28] [28]

Definition 11(Biclique Reduction).Let R = (B, A, E, p)be a PRN and( X, Y )a coherent biclique with edge set EX,Y

for everyy∈Y∩Bandx 1, x2 ∈X, we havep(x 1, y) =p(x 2, y). Definition 11(Biclique Reduction).Let R = (B, A, E, p)be a PRN and( X, Y )a coherent biclique with edge set EX,Y . Then, thebiclique reductionof R with respect to( X, Y )is a new PRN R′ = (B, A′, E′, p′), whereA ′ =A⊔ {z}for some new vertexzand E′ := (E\E X,Y )∪ {(x, z)|x∈X} ∪ {(z, y)|y∈Y}, p′(u, v...

work page

[29] [29]

add the clause ( xi ∨ xj) toφ

work page

[30] [30]

add the clause ( xi ∨x j) toφ; or

work page

[31] [31]

There are 3(n

add neither clause toφ. There are 3(n

work page

[32] [32]

Proposition 3.There is a monotone 2-CNF formula with n variables whose smallest 2-CNF encoding has at least( 1 4 −o(1)) n2 lgn clauses

= 3(1/2−o(1))n2 ways to carry out this construction, each of which yields a distinct simple 2-CNF formula onnvariables. Proposition 3.There is a monotone 2-CNF formula with n variables whose smallest 2-CNF encoding has at least( 1 4 −o(1)) n2 lgn clauses. Proof. Let g(m) be the number of 2-CNF formulas with at most m clauses, and let h(n) := 2(n

work page

[33] [33]

By Lemma 2, lgg (m) ≤ (1 +o(1))m·lgm

be the number of monotone 2-CNF functions on n variables. By Lemma 2, lgg (m) ≤ (1 +o(1))m·lgm . If every monotone 2-CNF function on n variables can be encoded with at most m clauses, then g(m) ≥h (n), and consequently lgg (m) ≥lgh (n). If m≤ (r + o(1))n2/lgn for some constant r, then lgm≤(2 +o(1)) lgn, and thus (1/2−o(1))n 2 ≤lgh(n)≤lgg(m)≤(1 +o(1))m·lgm...

work page

[34] [34]

Since randomness is involved, all our results, both for clauses, variables, and runtimes, are averaged over 5 independent trials

is well-concentrated around 2 lgn , we consider both a satisfiable regime, for which we set k := 1 + ⌊1.2 lgn⌋ , as well as an unsatisfiable regime for which we set k := ⌊30 lgn⌋ in order to keep the runtimes manageable (as k approaches the critical threshold, the instances become harder to solve). Since randomness is involved, all our results, both for c...

work page 1901