DisjunctiveNet: Neural Symbolic Learning via Differentiable Convexified Optimization Layers

Can Li; Shraman Pal

arxiv: 2605.30456 · v2 · pith:DXJDHFVXnew · submitted 2026-05-28 · 💻 cs.LG · math.OC

Pith reviewed 2026-06-29 08:39 UTC · model grok-4.3

keywords: neuro-symbolic learning, differentiable optimization layers, disjunctive constraints, convex relaxations, mixed integer linear constraints, neural networks with hard constraints

The pith

Neural networks can enforce exact satisfaction of input-dependent mixed-integer rules by embedding convex relaxations of disjunctive constraints as differentiable optimization layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Many learning tasks involve sparse data paired with domain rules expressed as logical propositions and linear inequalities. Existing neuro-symbolic methods typically rely on soft penalties, fixed rules, or non-differentiable post-processing to handle constraints. The paper develops an approach that represents rules as disjunctive constraints, applies hierarchical convex relaxations to produce convex hull formulations, and inserts the resulting linear programs as differentiable layers inside neural networks. If the relaxations are tight, this yields end-to-end trainable models that satisfy the original rules exactly at inference time. Experiments on real-world datasets report both perfect rule compliance and competitive accuracy.

Core claim

The paper claims that representing rules as disjunctive constraints and applying hierarchical convex relaxations yields tractable linear constraints that embed as differentiable optimization layers, enabling end-to-end neural network training with exact satisfaction of hard, input-dependent mixed integer linear constraints.
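
For orientation, the textbook extended (Balas-type) convex hull formulation of a single disjunction over nonempty, bounded polyhedra is sketched below. This is the standard construction from disjunctive programming, not a transcription of the paper's own equations; the symbols $A_i$, $b_i$, $y_i$ and $\lambda_i$ are notation assumed here.

```latex
% Disjunction: y must lie in at least one polyhedron S_i
\[
  y \in \bigcup_{i=1}^{k} S_i,
  \qquad S_i = \{\, y : A_i y \le b_i \,\}
\]

% Extended formulation whose projection onto y is conv(S_1 \cup ... \cup S_k)
\[
  y = \sum_{i=1}^{k} y_i,
  \qquad A_i y_i \le b_i \lambda_i \quad (i = 1, \dots, k),
  \qquad \sum_{i=1}^{k} \lambda_i = 1,
  \qquad \lambda_i \ge 0
\]
```

With binary $\lambda_i$ this describes the union exactly; relaxing to $\lambda_i \ge 0$ gives its convex hull. Every extreme point of that hull is an extreme point of some $S_i$, so a linear objective that is minimized at a vertex returns a point satisfying the original disjunction.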

What carries the argument

Hierarchical convex relaxations of disjunctive constraints that produce convex hull formulations embeddable as differentiable linear optimization layers.
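
A small sketch of the two ends of such a hierarchy, using one-dimensional interval rules. It assumes the usual reading of the terms: the CNF-level relaxation convexifies each rule separately and intersects the results, while the DNF-level relaxation convexifies the union of the non-empty intersections. The rule values and function names are hypothetical, chosen only so that the two relaxations differ.

```python
from itertools import product

def hull(intervals):
    """Convex hull of a union of closed intervals: [min lower, max upper]."""
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))

def intersect(intervals):
    """Intersection of closed intervals, or None if it is empty."""
    lo = max(l for l, _ in intervals)
    hi = min(h for _, h in intervals)
    return (lo, hi) if lo <= hi else None

def cnf_relaxation(rules):
    """Coarsest level: convexify each rule on its own, then intersect."""
    return intersect([hull(rule) for rule in rules])

def dnf_terms(rules):
    """Pick one disjunct per rule and keep the non-empty intersections."""
    terms = (intersect(choice) for choice in product(*rules))
    return [t for t in terms if t is not None]

def dnf_relaxation(rules):
    """Tightest level: convexify the union of the DNF terms."""
    return hull(dnf_terms(rules))

# Hypothetical rules, each a union of two intervals (not from the paper).
rules = [
    [(0.0, 3.0), (8.0, 9.0)],  # C1
    [(0.0, 3.0), (5.0, 6.0)],  # C2
]
cnf = cnf_relaxation(rules)
dnf = dnf_relaxation(rules)
print(cnf)  # (0.0, 6.0)
print(dnf)  # (0.0, 3.0)
```

Here the true feasible set is [0, 3]. The CNF-level relaxation admits every point up to 6, so a layer built on it could return a rule-violating output, while the DNF-level hull coincides with the feasible set. The price is that the number of DNF terms can grow as the product of the disjunct counts.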

Load-bearing premise

The optimal solutions returned under the hierarchical convex relaxations always lie inside the feasible set of the original mixed-integer constraints, for the rules encountered in the target applications.

What would settle it

A test input where the network output after the optimization layer violates one of the original logical rules would show that the relaxation does not preserve exact feasibility.
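
Such a test is cheap to run. The sketch below checks a scalar output against rules given as unions of closed intervals and reports which rules fail. The rule values, the tolerance, and the function names are assumptions made for illustration; the paper's rules are input-dependent and generally multi-dimensional.

```python
def satisfies_rule(y, rule, tol=1e-9):
    """A rule is a union of closed intervals; y must fall in at least one."""
    return any(lo - tol <= y <= hi + tol for lo, hi in rule)

def violated_rules(y, rules, tol=1e-9):
    """Indices of the rules that y violates (empty list means all hold)."""
    return [k for k, rule in enumerate(rules) if not satisfies_rule(y, rule, tol)]

# Hypothetical rules whose joint feasible set is [0, 3] U [7, 10].
rules = [
    [(0.0, 5.0), (7.0, 12.0)],   # C1
    [(-2.0, 3.0), (6.0, 10.0)],  # C2
]
print(violated_rules(3.0, rules))  # []
print(violated_rules(4.5, rules))  # [1]
print(violated_rules(5.5, rules))  # [0, 1]
```

A single non-empty result on any layer output would be the counter-example described above.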

Figures

Figures reproduced from arXiv: 2605.30456 by Can Li, Shraman Pal.

**Figure 1.** CNF and DNF illustration for 2 rules with 2 disjuncts each. The nonconvex set we are trying to convexify is …

**Figure 2.** Lifted projection for two active rules. The top number line shows two active rules, C₁ = S₁₁ ∪ S₁₂ and C₂ = S₂₁ ∪ S₂₂, whose DNF intersections give S₁ ∪ S₂ = [0, 3] ∪ [7, 10]. Following Eqs. (5)-(12), these DNF terms are lifted into (y, η)-space as Ŝ₁ and Ŝ₂. For ŷ = 4.5, minimizing η over the lifted convex hull returns the extreme point y⋆ = 3, which satisfies both rules exactly. A formal version of the the…

**Figure 3.** Synthetic task: Sequential convexification. Tightening the relaxation from CNF toward DNF improves constraint satisfaction (CSAT) and improves MSE, across two training set sizes. The CNF achieves significantly less constraint satisfaction in the OOD test set, and higher MSE compared to DNF and the intermediates. The plateau in CSAT is observed due to the fact that very few samples have such a large number …

**Figure 4.** Synthetic task: Dataset size. Performance of different methods for increasing amounts of training data. Projection-based methods (CNF, DNF) achieve substantially lower MSE and higher rule satisfaction than the base and penalty-based baselines. DNF achieves complete constraint satisfaction across both IID and OOD test sets.

**Figure 5.**
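
The numbers in the Figure 2 caption can be reproduced with a few lines. The sketch assumes that the lifted coordinate η is the distance |y − ŷ|, which the caption does not state, and it replaces the linear program with a shortcut: because a linear objective over the convex hull of a union of polytopes is minimized at an extreme point of one of them, taking the best per-term optimum gives the same answer. It is an illustration of the geometry, not the paper's differentiable layer, and the function name is hypothetical.

```python
def project_onto_union(y_hat, terms):
    """Nearest point to y_hat in a union of closed intervals.

    Solves the per-term problem (clamp y_hat into the interval) and keeps
    the term with the smallest eta. Returns (y_star, eta_star).
    """
    best = None
    for lo, hi in terms:
        y = min(max(y_hat, lo), hi)   # closest point of this term to y_hat
        eta = abs(y - y_hat)          # assumed lifted coordinate: distance
        if best is None or eta < best[1]:
            best = (y, eta)
    return best

terms = [(0.0, 3.0), (7.0, 10.0)]   # S1 and S2 from the Figure 2 caption
y_star, eta_star = project_onto_union(4.5, terms)
print(y_star, eta_star)   # 3.0 1.5
```

The second term would give y = 7 with η = 2.5, so the minimum over the lifted hull is attained only at (3, 1.5), matching the extreme point y⋆ = 3 reported in the caption.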

Original abstract

Many learning tasks in science and engineering are characterized by sparse datasets, which limits the effectiveness of purely data-driven approaches. At the same time, these problems are often accompanied by rich domain knowledge derived from physical laws, operational requirements, and expert heuristics. Such knowledge is frequently expressed as rules involving logical propositions and linear inequalities. Existing neuro-symbolic methods typically enforce these rules approximately through soft penalties, assume input-independent rules when designing specialized architectures, or rely on non-differentiable post-processing at inference time to achieve hard constraint satisfaction. While recent advances in differentiable optimization layers enable end-to-end feasibility enforcement within neural networks, extending these approaches to logical or mixed-integer rules remains challenging due to inherent nonconvexity. In this work, we propose a unified end-to-end framework for enforcing hard, input-dependent mixed integer linear constraints within neural networks. Our approach represents rules as disjunctive constraints and applies hierarchical convex relaxations to obtain convex hull formulations. These relaxations yield tractable linear constraints that can be embedded as differentiable optimization layers while enabling exact rule satisfaction. We demonstrate the effectiveness of the proposed framework on real-world datasets, achieving perfect rule satisfaction and strong predictive performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a framework for turning input-dependent disjunctive MIL rules into differentiable convex layers, but whether the relaxations actually deliver exact satisfaction is the part that needs checking.

The core idea is to represent logical and linear rules as disjunctive constraints, then apply hierarchical convex relaxations to produce linear programs that can sit inside a neural net as an optimization layer. This is positioned as a way to get hard, input-dependent constraint satisfaction without penalties or post-processing.

What stands out is the attempt to handle mixed-integer rules in a unified, end-to-end manner for sparse-data settings where domain rules matter. The abstract correctly notes that most existing neuro-symbolic work either softens the constraints or assumes they do not depend on the input. Extending differentiable optimization layers to this case is a reasonable next step.

The main uncertainty is whether the relaxations are tight. The claim of exact rule satisfaction requires that the convex hull formulations have no gap relative to the original mixed-integer set. Standard big-M or lifted relaxations often leave fractional solutions that violate the logic; the paper asserts that its hierarchy closes this for the target rules, but the abstract supplies no derivation or counter-example check. Without seeing the specific construction or the numerical results, it is difficult to know if the method works as stated or only approximately.
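
The fractional-solution concern is easy to see on a toy set. The sketch below encodes y ∈ [0, 3] ∪ [7, 10] with one selector variable z and a big-M constant, then relaxes z to the interval [0, 1]. The set is borrowed from the Figure 2 caption; the encoding, the value M = 7, and the function names are assumptions made for illustration.

```python
def bigm_relaxation_feasible(y, z, M=7.0):
    """Continuous relaxation of a big-M encoding of y in [0, 3] U [7, 10].

    z = 1 selects [0, 3] and z = 0 selects [7, 10]; here z may be fractional.
    """
    return (
        0.0 <= z <= 1.0
        and 0.0 <= y <= 10.0
        and y <= 3.0 + M * (1.0 - z)   # upper bound of [0, 3], off when z = 0
        and y >= 7.0 - M * z           # lower bound of [7, 10], off when z = 1
    )

def in_union(y):
    """Membership in the original nonconvex set."""
    return 0.0 <= y <= 3.0 or 7.0 <= y <= 10.0

print(bigm_relaxation_feasible(4.5, 0.5))  # True
print(in_union(4.5))                       # False
print(bigm_relaxation_feasible(4.5, 1.0))  # False
print(bigm_relaxation_feasible(4.5, 0.0))  # False
```

With z = 0.5 the relaxation accepts y = 4.5, which lies in neither interval, while both integral choices of z reject it. Note that y = 4.5 also lies in the convex hull [0, 10] of the set itself, so tightness alone does not exclude it; the Figure 2 caption indicates that the layer instead relies on the optimum of the lifted problem landing on an extreme point.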

Experiments are mentioned as showing perfect satisfaction and competitive prediction, yet the protocols, baselines, and rule sets are not visible here. That makes it hard to judge how much the framework contributes versus careful problem design.

This is for researchers already working on differentiable optimization layers or neuro-symbolic methods who need to embed hard rules. A reader looking for a practical route to constraint satisfaction in scientific ML would find the direction useful if the tightness holds.

It is worth sending to referees because the problem is concrete and the proposed construction is a direct extension of existing layer techniques. The math and experiments will decide whether the exactness claim lands.

Referee Report

1 major / 0 minor

Summary. The paper proposes DisjunctiveNet, a unified end-to-end framework for embedding hard, input-dependent mixed-integer linear constraints (represented as disjunctive constraints) into neural networks. It applies hierarchical convex relaxations to obtain convex-hull formulations that are embedded as differentiable optimization layers, claiming this enables exact rule satisfaction at inference while supporting training with strong predictive performance on real-world datasets with sparse data and domain knowledge.

Significance. If the hierarchical convex relaxations are shown to produce tight convex-hull formulations whose optima coincide exactly with the original mixed-integer feasible sets for the target rule structures, the work would advance neuro-symbolic learning by providing a principled way to enforce complex logical rules differentiably without soft penalties or non-differentiable post-processing.

major comments (1)

[Abstract] Abstract (paragraph on the proposed approach): The central claim that the hierarchical convex relaxations 'yield tractable linear constraints ... while enabling exact rule satisfaction' and produce 'convex hull formulations' requires that the relaxations are tight (i.e., their optimal solutions coincide with the mixed-integer feasible set). The manuscript must provide a proof, explicit conditions, or empirical verification that no fractional solutions remain for the input-dependent disjunctions considered; otherwise the differentiable layer can return points satisfying the continuous relaxation but violating the original logical rules.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and for identifying the need to substantiate the tightness of the relaxations. We respond to the major comment below.

Referee: [Abstract] Abstract (paragraph on the proposed approach): The central claim that the hierarchical convex relaxations 'yield tractable linear constraints ... while enabling exact rule satisfaction' and produce 'convex hull formulations' requires that the relaxations are tight (i.e., their optimal solutions coincide with the mixed-integer feasible set). The manuscript must provide a proof, explicit conditions, or empirical verification that no fractional solutions remain for the input-dependent disjunctions considered; otherwise the differentiable layer can return points satisfying the continuous relaxation but violating the original logical rules.

Authors: We agree that the claim of exact rule satisfaction at inference requires the relaxations to be tight. The manuscript constructs the hierarchical convex relaxations precisely to recover convex-hull formulations of the input-dependent disjunctive sets (see Sections 3.2–3.3), with the layer solved via the resulting linear program. To make this explicit, we will add a dedicated subsection in the revised manuscript that states the conditions under which the hierarchy yields the convex hull, includes a short proof sketch for the disjunctive structures considered, and reports an empirical check confirming that the layer outputs lie at vertices of the original mixed-integer feasible set on the evaluated rule families.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper proposes a framework that represents rules as disjunctive MIL constraints and applies hierarchical convex relaxations to produce convex-hull linear programs embeddable as differentiable layers. The central claim that these relaxations enable exact rule satisfaction rests on the asserted coincidence of the relaxed optima with the original mixed-integer feasible set for the target rule structures. No equations or steps in the provided text reduce a prediction or result to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation chain; the derivation is presented as a direct technical construction from standard convex-relaxation techniques. The method is therefore self-contained against external benchmarks of the relaxation properties.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, background axioms, or invented entities; full manuscript required for ledger construction.
