pith. sign in

arxiv: 2605.21335 · v1 · pith:G7YM22CAnew · submitted 2026-05-20 · 💻 cs.LO

A Two-Watched Literal Scheme for First-Order Logic

Pith reviewed 2026-05-21 03:20 UTC · model grok-4.3

classification 💻 cs.LO
keywords first-order logictwo-watched literalsclause propagationCDCLsatisfiabilityground literalsfirst-order clausesconflict detection
0
0 comments X

The pith

The two-watched literal scheme extends to first-order logic clauses to efficiently detect propagations and conflicts from ground literals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper adapts the two-watched literal technique, familiar from fast SAT solvers, to first-order logic. The adaptation lets a solver spot which clauses become true or false when some ground facts are known, without checking every literal in every clause. A sympathetic reader would care because first-order reasoning often deals with large clause sets where naive scanning is too slow. The authors give the method as a set of rules that they prove sound and complete, and they show an implementation that beats a standard dynamic-programming check especially on long clauses.

Core claim

Given a set of first-order clauses and a set of ground literals, the lifted two-watched literal scheme efficiently detects all propagating and false clauses with respect to the ground literals. The algorithm is presented as a system of rules and proved sound and complete. An implementation of the scheme outperforms a standard dynamic programming approach for detecting propagatable literals and conflicts, especially when dealing with long clauses.

What carries the argument

The lifted two-watched literal scheme, which maintains two watched literals per clause to track possible propagations and conflicts without full clause scans each time a ground literal arrives.

If this is right

  • The scheme integrates into first-order CDCL solvers to speed up propagation.
  • It correctly identifies all clauses that propagate a literal or become false under the given ground facts.
  • Performance gains appear most clearly when clauses contain many literals.
  • Incremental updates to the watch structure avoid revisiting the whole clause set on each new literal.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If adopted, first-order theorem provers could handle larger problems in verification and AI planning without proportional slowdowns in propagation.
  • The rule-based presentation may allow easy combination with other first-order reasoning techniques such as unification.
  • Testing the scheme on standard first-order benchmark sets would show whether the efficiency gains hold beyond the authors' test cases.

Load-bearing premise

The scheme assumes that ground literals arrive as input and that the two-watched structure for each clause can be maintained and updated incrementally without rechecking the entire set of clauses every time.

What would settle it

A collection of first-order clauses together with ground literals for which the scheme fails to report a clause that should propagate a literal, or reports a propagation that does not actually follow, would show the rules are neither complete nor sound.

Figures

Figures reproduced from arXiv: 2605.21335 by Christoph Weidenbach, Lorenz Leutgeb, Martin Bromberger, Simon Schwarz, Tobias Gehl, Yasmine Briefs.

Figure 1
Figure 1. Figure 1: Runtime comparison on all eligible TPTP v9.2.1 problems. 10 5 10 4 10 3 10 2 10 1 10 0 10 1 10 2 Runtime of DP (s, log-scaled) 10 5 10 4 10 3 10 2 10 1 10 0 10 1 10 2 Runtime of TWFO (s, log-scaled) Total Runtime of DP and TWFO, problems with median clause size > 2 (N=820) [PITH_FULL_IMAGE:figures/full_fig_p025_1.png] view at source ↗
Figure 4
Figure 4. Figure 4: Number of considered in￾stances on all eligible problems [PITH_FULL_IMAGE:figures/full_fig_p025_4.png] view at source ↗
read the original abstract

The two-watched literal scheme, a core component of efficient CDCL (Conflict-Driven Clause Learning) implementations for propositional logic, is extended to first-order logic. Given a set of first-order clauses and a set of ground literals, our lifted two-watched literal scheme efficiently detects all propagating and false clauses with respect to the ground literals. We present the algorithm as a system of rules and prove its soundness and completeness. Additionally, we provide an implementation of the two-watched literal scheme, which outperforms a standard dynamic programming approach for detecting propagatable literals and conflicts, especially when dealing with long clauses.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper extends the two-watched literal scheme from propositional CDCL to first-order logic. Given a set of first-order clauses and ground literals, it defines a system of rules that detects all propagating and false clauses, proves the soundness and completeness of the rule system, and reports an implementation that outperforms dynamic programming on long clauses.

Significance. If the soundness and completeness proofs hold, the work offers a practical optimization for incremental clause watching in first-order settings, avoiding full rescans on each new ground literal. The empirical comparison with dynamic programming is a concrete strength, indicating potential utility in theorem provers or SMT solvers handling large clause sets.

major comments (2)
  1. [§3] §3 (Rule definitions): the completeness argument for detecting all false clauses under multiple possible unifiers is not fully load-bearing without an explicit case analysis showing that the watching rules do not miss ground instances when a clause has more than one matching literal.
  2. [§4] §4 (Soundness proof): the induction on rule applications assumes incremental maintenance of watches; the base case for the initial assignment of watches to non-ground clauses should be stated explicitly to confirm no initial propagations are overlooked.
minor comments (3)
  1. [Experiments] The experimental section should report the exact clause lengths and number of ground literals used in the dynamic-programming comparison to make the 'outperforms especially for long clauses' claim reproducible.
  2. [Preliminaries] Notation for watched literals mixes propositional and first-order terminology; a short table distinguishing the lifted concepts would improve clarity.
  3. [§3] A few minor typos appear in the rule schemas (e.g., missing subscript on the unification variable in rule R3).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and the constructive comments on the rule definitions and soundness proof. We address each major comment below and will incorporate clarifications in the revised manuscript.

read point-by-point responses
  1. Referee: [§3] §3 (Rule definitions): the completeness argument for detecting all false clauses under multiple possible unifiers is not fully load-bearing without an explicit case analysis showing that the watching rules do not miss ground instances when a clause has more than one matching literal.

    Authors: We appreciate this observation on strengthening the completeness argument. The proof in §3 establishes that the watching rules detect all false clauses by maintaining the invariant that each clause has two watches and that propagations occur only on valid unifiers. For clauses with multiple matching literals under different unifiers, the rules ensure that if a ground instance falsifies all but one literal, a watch will trigger detection or propagation. To make this fully explicit as requested, we will add a dedicated case analysis paragraph in the revised §3, enumerating scenarios with multiple unifiers and confirming no ground instances are missed by the watching scheme. revision: yes

  2. Referee: [§4] §4 (Soundness proof): the induction on rule applications assumes incremental maintenance of watches; the base case for the initial assignment of watches to non-ground clauses should be stated explicitly to confirm no initial propagations are overlooked.

    Authors: We thank the referee for highlighting the need for an explicit base case. The soundness proof proceeds by induction on the number of rule applications, with the initial configuration defined by the watch assignment algorithm applied to the input clauses and ground literals. This initial assignment ensures that any propagatable literals from the starting ground literals are immediately detected by the rules, with no overlooked propagations. In the revised version, we will explicitly state and justify this base case at the beginning of the induction in §4, including a short argument that the non-ground clauses receive watches consistent with the two-watched literal scheme before any rules are applied. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained via direct rule definitions and proofs

full rationale

The paper introduces a lifted two-watched literal scheme for first-order logic as an explicit system of rules, then states that soundness and completeness are proved for detecting propagating and false clauses given first-order clauses and ground literals. An implementation is supplied and benchmarked against dynamic programming, supplying an external performance check. No load-bearing step reduces by definition or self-citation to the target result itself; the rules are presented as new and their properties are claimed to be established independently rather than fitted or renamed from prior outputs. The efficiency claim rests on the reported implementation comparison, not on any internal tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the existence of a sound and complete rule system for watching first-order clauses; the abstract does not list explicit free parameters or new invented entities.

pith-pipeline@v0.9.0 · 5634 in / 1054 out tokens · 15443 ms · 2026-05-21T03:20:06.976661+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages

  1. [1]

    In: Auto- mated Reasoning – 13th International Joint Conference, IJCAR 2026, Lisbon, Portugal, July 26–29, 2026, Proceedings (2026), to appear 27

    Briefs, Y., Bromberger, M., Gehl, T., Leutgeb, L., Schwarz, S., Weiden- bach, C.: A two-watched literal scheme for first-order logic. In: Auto- mated Reasoning – 13th International Joint Conference, IJCAR 2026, Lisbon, Portugal, July 26–29, 2026, Proceedings (2026), to appear 27

  2. [2]

    https://doi.org/10.5281/zenodo.20138614

    Briefs, Y., Bromberger, M., Gehl, T., Leutgeb, L., Schwarz, S., Weiden- bach, C.: A two-watched literal scheme for first-order logic: Implemen- tation artifact (2026). https://doi.org/10.5281/zenodo.20138614

  3. [3]

    In: Pientka, B., Tinelli, C

    Bromberger, M., Desharnais, M., Weidenbach, C.: An Isabelle/HOL formalization of the SCL(FOL) calculus. In: Pientka, B., Tinelli, C. (eds.) Automated Deduction - CADE 29 - 29th International Conference on Automated Deduction, Rome, Italy, July 1-4, 2023, Proceedings. Lecture Notes in Computer Science, vol. 14132, pp. 116–133. Springer (2023). https://doi....

  4. [4]

    In: Konev, B., Schon, C., Steen, A

    Bromberger, M., Gehl, T., Leutgeb, L., Weidenbach, C.: A two-watched literal scheme for first-order logic. In: Konev, B., Schon, C., Steen, A. (eds.) Proceedings of the Workshop on Practical Aspects of Automated Reasoning Co-located with the 11th International Joint Conference on Automated Reasoning (FLoC/IJCAR 2022), Haifa, Israel, August, 11 - 12, 2022....

  5. [5]

    CoRRabs/2302.05954(2023)

    Bromberger, M., Schwarz, S., Weidenbach, C.: SCL(FOL) re- visited. CoRRabs/2302.05954(2023). https://doi.org/10.48550/ ARXIV.2302.05954

  6. [6]

    McCune, W.: Experiments with discrimination-tree indexing and path indexing for term retrieval. J. Autom. Reason.9(2), 147–167 (1992). https://doi.org/10.1007/BF00245458

  7. [7]

    226–231 (2001)

    Moskewicz, M.W., Madigan, C.F., Zhao, Y., Zhang, L., Malik, S.: Chaff: Engineering an efficient SAT solver. In: Proceedings of the 38th Design Automation Conference, DAC 2001, Las Vegas, NV, USA, June 18-22, 2001.pp.530–535.ACM(2001).https://doi.org/10.1145/378239. 379017

  8. [8]

    In: Goré, R., Leitsch, A., Nipkow, T

    Nieuwenhuis, R., Hillenbrand, T., Riazanov, A., Voronkov, A.: On the evaluation of indexing techniques for theorem proving. In: Goré, R., Leitsch, A., Nipkow, T. (eds.) Automated Reasoning, First In- ternational Joint Conference, IJCAR 2001, Siena, Italy, June 18-23, 2001, Proceedings. LNCS, vol. 2083, pp. 257–271. Springer (2001). https://doi.org/10.1007...

  9. [9]

    (eds.): Handbook of Automated Reason- ing (in 2 volumes)

    Robinson, J.A., Voronkov, A. (eds.): Handbook of Automated Reason- ing (in 2 volumes). Elsevier and MIT Press (2001) 28

  10. [10]

    In: Rutenbar, R.A., Otten, R.H.J.M

    Silva, J.P.M., Sakallah, K.A.: GRASP - a new search algorithm for satisfiability. In: Rutenbar, R.A., Otten, R.H.J.M. (eds.) Proceedings of the 1996 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 1996, San Jose, CA, USA, November 10-14, 1996. pp. 220–227. IEEE Computer Society / ACM (1996). https://doi.org/10. 1109/ICCAD.1996.569607

  11. [11]

    SAT2005(53), 1–2 (2005)

    Sorensson, N., Een, N.: Minisat v1.13 - a SAT solver with conflict-clause minimization. SAT2005(53), 1–2 (2005)

  12. [12]

    https://doi.org/10.1007/s10817-017-9407-7

    Sutcliffe, G.: The TPTP Problem Library and Associated Infrastruc- ture.FromCNFtoTH0, TPTPv6.4.0.JournalofAutomatedReasoning 59(4), 483–502 (2017). https://doi.org/10.1007/s10817-017-9407-7

  13. [13]

    Sutcliffe, G.: Proceedings of CASC-30 – the CADE-30 ATP system competition (2025), https://tptp.org/CASC/30/Proceedings.pdf 29