Conflict-Aware Fusion: Mitigating Logic Inertia in Large Language Models via Structured Cognitive Priors
Pith reviewed 2026-05-17 01:05 UTC · model grok-4.3
The pith
A four-stage training pipeline makes LLMs check rules for contradictions before reasoning, fixing their tendency to follow inconsistent premises.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that Logic Inertia arises because generative models persist along learned deductive paths even under inconsistent premises, and that Conflict-Aware Fusion mitigates it by enforcing verification-before-deduction as a structural prior through four stages: SFT for a verification preamble, DPO to sharpen the halt-on-contradiction boundary, Logical Invariance REgularisation via symmetric KL on equivalent rule forms, and Reinforcement Learning from Verification Feedback that uses a deterministic symbolic oracle for joint optimization of invariance and sensitivity. The pipeline saturates all four stress tests on 1.5B and 8B backbones and extends to a Lean 4 kernel that reaches 99.0
What carries the argument
Conflict-Aware Fusion, a four-stage pipeline that trains verification-before-deduction using SFT, DPO, LIRE for logical invariance, and RLVF with a symbolic forward-chaining reward oracle.
If this is right
- Untreated baselines drop from 1.00 to 0.00 accuracy on contradiction injection while trained models maintain full instance-level exact match.
- The method succeeds on both 1.5B and 8B parameter backbones across all four stress tests.
- Replacing the propositional oracle with a Lean 4 kernel yields 99.0 percent agreement on the 105 classically derivable questions in a 187-question sample.
- The approach supplies a concrete path toward formally verified reinforcement learning for language models.
Where Pith is reading between the lines
- The verification prior could transfer to multi-step planning tasks where rule sets evolve over time.
- Hybrid neuro-symbolic training of this kind may reduce error accumulation in long inference chains beyond the tested benchmarks.
- The Lean 4 extension points toward scalable integration with automated theorem provers for safety-critical applications.
- Real-world rule systems such as regulatory compliance or diagnostic protocols could serve as natural next testbeds.
Load-bearing premise
The four stress tests cover the main logical inconsistencies that appear in real LLM deployments and the symbolic verification engine supplies reward signals free of its own coverage biases.
What would settle it
Apply the trained models to a fresh collection of contradictory rule sets never seen in training and check whether accuracy stays near 1.0 or drops toward the untreated baseline of 0.0.
Figures
read the original abstract
Large language models (LLMs) achieve high accuracy on many reasoning benchmarks but remain brittle under structural perturbations of rule-based systems. We introduce a diagnostic framework with four stress tests -- redundant vs. essential rule deletion, contradictory-rule injection, logic-preserving rewrites, and multi-law stacking -- and use it to expose Logic Inertia: the tendency of generative LLMs (Qwen2/3, TinyLlama, GPT-4o, Gemma-3-4B-IT) and the encoder-only BERT baseline to persist along learned deductive trajectories under inconsistent premises. The collapse is sharp: untreated baselines fall from accuracy 1.00 on the base task to 0.00 on contradiction injection (instance-level exact match), and GPT-4o resolves only 56.0% of contradiction cases. We propose Conflict-Aware Fusion, a four-stage training pipeline that enforces verification-before-deduction as a learned structural prior: (i) SFT establishes the verification preamble; (ii) DPO sharpens the halt-on-contradiction decision boundary; (iii) Logical Invariance REgularisation (LIRE) penalises divergence between logically equivalent rule formulations via symmetric KL; (iv) Reinforcement Learning from Verification Feedback (RLVF) uses a symbolic forward-chaining engine as a deterministic oracle reward, jointly optimising invariance and sensitivity. The pipeline saturates all four primary stress tests for both 1.5B and 8B backbones. We further validate a Phase 2 extension that replaces the propositional oracle with a Lean 4 kernel, attaining 99.0% kernel agreement on the 105 classically-derivable (T) questions within a stratified 187-question Lean-translated sample (overall 71.7% across both polarities), providing a sound upgrade path to formally verified RL training. Code and benchmark: https://github.com/14H034160212/lemo
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that LLMs exhibit 'Logic Inertia' by persisting along learned deductive trajectories under inconsistent premises, as demonstrated by sharp accuracy drops (to 0.00 on contradiction injection for baselines, 56% for GPT-4o) across four stress tests: redundant/essential rule deletion, contradictory-rule injection, logic-preserving rewrites, and multi-law stacking. It proposes Conflict-Aware Fusion, a four-stage pipeline (SFT for verification preamble, DPO for halt-on-contradiction, LIRE for symmetric KL invariance on equivalent formulations, and RLVF using a propositional forward-chaining engine as deterministic oracle) that saturates all four tests for 1.5B and 8B backbones. A Phase 2 extension replaces the oracle with a Lean 4 kernel, achieving 99.0% agreement on 105 classically-derivable (T) questions in a stratified 187-question sample (overall 71.7% across polarities).
Significance. If the results hold, this provides a concrete hybrid training recipe for embedding verification-before-deduction priors into LLMs, reducing brittleness on structural perturbations of rule systems. The external symbolic/Lean oracles lower circularity relative to purely self-supervised approaches, and the reproducible code plus benchmark release are positive for verification. The work could inform more reliable LLM deployment in domains requiring consistent rule application.
major comments (2)
- [Results] Results and experimental details: The central saturation claim for all four stress tests on 1.5B/8B models is reported without data-split descriptions, statistical significance tests, or ablation controls isolating each stage (SFT, DPO, LIRE, RLVF). This leaves open whether observed gains reflect the proposed pipeline or post-hoc pipeline choices, directly affecting support for the mitigation claim.
- [Method (RLVF stage)] RLVF and oracle description: The propositional forward-chaining engine is used as a deterministic reward source for contradiction penalization, yet the manuscript does not characterize its coverage for patterns requiring negation-as-failure or non-monotonic reasoning. If such patterns are silently resolved or ignored, high benchmark scores could be achieved without eliminating logic inertia on unseen inconsistency types; the Phase 2 Lean results (99% on T-questions only) do not test this gap because the sample is restricted to classically-derivable cases.
minor comments (2)
- [Introduction] The term 'Logic Inertia' is introduced without explicit comparison to prior notions of reasoning brittleness or hallucination in rule-based settings; a brief related-work paragraph would clarify novelty.
- [Method] Notation for LIRE (symmetric KL between equivalent formulations) could be formalized with an equation to make the invariance penalty precise.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below, indicating planned revisions where appropriate to strengthen the presentation of our results and methods.
read point-by-point responses
-
Referee: [Results] Results and experimental details: The central saturation claim for all four stress tests on 1.5B/8B models is reported without data-split descriptions, statistical significance tests, or ablation controls isolating each stage (SFT, DPO, LIRE, RLVF). This leaves open whether observed gains reflect the proposed pipeline or post-hoc pipeline choices, directly affecting support for the mitigation claim.
Authors: We agree that additional experimental details are needed to fully support the saturation claims. In the revised manuscript we will add: (i) explicit descriptions of the train/validation/test splits for each backbone and stress test, (ii) statistical significance results (bootstrap confidence intervals and paired tests across three random seeds), and (iii) a dedicated ablation table that reports performance after each successive stage (SFT, then +DPO, then +LIRE, then +RLVF). These additions will clarify that the gains arise from the cumulative pipeline rather than isolated post-hoc choices. revision: yes
-
Referee: [Method (RLVF stage)] RLVF and oracle description: The propositional forward-chaining engine is used as a deterministic reward source for contradiction penalization, yet the manuscript does not characterize its coverage for patterns requiring negation-as-failure or non-monotonic reasoning. If such patterns are silently resolved or ignored, high benchmark scores could be achieved without eliminating logic inertia on unseen inconsistency types; the Phase 2 Lean results (99% on T-questions only) do not test this gap because the sample is restricted to classically-derivable cases.
Authors: The propositional forward-chaining oracle is intentionally restricted to monotonic classical logic, matching the deductive structure of the four stress tests (rule deletion, contradiction injection, rewrites, and multi-law stacking). We will expand the RLVF section to explicitly state this scope, note that negation-as-failure and non-monotonic reasoning fall outside the current benchmark, and discuss why the chosen oracle is sufficient for the logic-inertia phenomena we target. We also agree that the Phase-2 Lean evaluation is limited to classically-derivable (T) questions; the revised text will clarify this restriction and outline it as a direction for future extension rather than a complete solution. revision: partial
Circularity Check
No significant circularity; results grounded in external symbolic oracles
full rationale
The paper defines four author-introduced stress tests to expose logic inertia in LLMs, then presents a four-stage Conflict-Aware Fusion pipeline (SFT, DPO, LIRE, RLVF) whose RLVF stage uses an independent symbolic forward-chaining engine as a deterministic oracle for rewards. A Phase-2 extension further substitutes a Lean 4 kernel, reporting 99.0% agreement on a stratified sample of classically-derivable questions. These external verification mechanisms supply the training signal and validation, so saturation on the stress tests is an empirical outcome of the training procedure rather than a quantity forced by construction from the test definitions themselves. No self-citations, self-definitional reductions, fitted parameters renamed as predictions, or ansatz smuggling appear in the derivation chain. The central claims therefore retain independent content against the external oracles and benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- hyperparameters across SFT, DPO, LIRE, and RLVF stages
axioms (1)
- domain assumption The symbolic forward-chaining engine (and Lean kernel) provides accurate, complete, and unbiased logical verification for the generated outputs.
invented entities (2)
-
Logic Inertia
no independent evidence
-
Conflict-Aware Fusion
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanLogicNat ≃ Nat (recovery theorem) echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
Phase 2 extension that replaces the propositional oracle with a Lean 4 kernel, attaining 99.0% kernel agreement on the 105 classically-derivable (T) questions
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanabsolute_floor_iff_bare_distinguishability unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Cognitive Structure Hypothesis: Reliable multi-step reasoning requires an explicit structural separation between premise verification and deductive execution
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.