Smooth Piecewise Cutting for Neural Operator to Handle Discontinuities and Sharp Transitions
Pith reviewed 2026-05-20 07:02 UTC · model grok-4.3
The pith
Cut-DeepONet partitions PDE domains into smooth subregions to learn operators around discontinuities without approximating them directly.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A lifting strategy reformulates the operator learning problem: the domain is partitioned into smooth subregions, and discontinuities are represented as boundaries in a higher-dimensional space. An auxiliary network predicts input-dependent cut locations for unseen inputs, so the neural operator can generate accurate smooth components in each region without directly approximating the discontinuities.
What carries the argument
The lifting strategy that partitions the domain into smooth subregions by representing discontinuities as boundaries in a higher-dimensional space.
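One way to write this idea down is sketched below. It is an illustration under assumed notation, not the paper's own formulation: the symbols $\mathcal{G}$, $a$, $c(a)$, $\Omega_k$, $u_k$ and $K$ are introduced here only for the example.

```latex
% Assumed notation (not taken from the paper):
%   a               input function, x \in \Omega a query point
%   c(a)            cut locations predicted by the auxiliary network
%   \Omega_k(c(a))  k-th smooth subregion, k = 1, ..., K
%   u_k             smooth component produced by the neural operator on region k
\[
  \mathcal{G}(a)(x) \;\approx\; \sum_{k=1}^{K}
    \mathbb{1}_{\Omega_k(c(a))}(x)\, u_k(a)(x),
  \qquad
  \overline{\Omega} \;=\; \bigcup_{k=1}^{K} \overline{\Omega_k(c(a))} .
\]
```

In this reading, the jumps are carried entirely by the indicator functions, so each $u_k$ only has to be smooth on its own region. One possible interpretation of "higher-dimensional" is that the pair $(x, k)$ is treated as a point in $\Omega \times \{1, \dots, K\}$; whether the paper lifts in exactly this way is not stated in the summary above.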
If this is right
- The method outperforms state-of-the-art neural operators on benchmark PDEs that contain discontinuities and sharp transitions.
- Performance remains strong even when the model is trained only on low-resolution datasets.
- Fewer trainable parameters are required compared with approaches that increase model capacity to approximate discontinuities inside continuous function spaces.
- Separating discontinuity modeling from smooth solution learning reduces overall training complexity.
Where Pith is reading between the lines
- The same lifting-plus-partitioning idea could be tested on operator learning tasks outside PDEs, such as learning mappings between functions that contain jumps or interfaces.
- Extending the auxiliary predictor to output uncertainty estimates on cut locations might improve robustness when discontinuities move with the input.
- The two-stage structure suggests a template for other neural architectures that currently force non-smooth behavior into smooth layers.
Load-bearing premise
An auxiliary network can accurately predict input-dependent discontinuity locations for unseen inputs, and the domain partitioning preserves all necessary information without new errors at the boundaries.
What would settle it
A set of test inputs where the auxiliary network's predicted discontinuity locations differ substantially from the true locations and the resulting solution error exceeds that of a standard neural operator trained on the same data.
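The kind of test described here can be illustrated on a toy problem. The sketch below is not from the paper: the target function, the grid, and the names `piecewise`, `relative_l2_error` and `sweep` are all hypothetical. It assumes the smooth branches are known exactly and only the cut location is wrong, so the error it reports is purely the cost of misplacing the cut.

```python
import math

def piecewise(x, cut):
    """Toy target (an assumption for this example): two smooth branches joined at `cut`."""
    return 1.0 + x if x < cut else -1.0 + x

def relative_l2_error(cut_true, cut_pred, n=2000):
    """Relative L2 error caused by evaluating the correct branches with a misplaced cut."""
    xs = [(i + 0.5) / n for i in range(n)]  # midpoints of a uniform grid on [0, 1]
    num = sum((piecewise(x, cut_pred) - piecewise(x, cut_true)) ** 2 for x in xs)
    den = sum(piecewise(x, cut_true) ** 2 for x in xs)
    return math.sqrt(num / den)

def sweep(cut_true=0.5, shifts=(0.0, 0.01, 0.02, 0.05)):
    """Error as a function of the cut displacement, in units of domain length."""
    return [(s, relative_l2_error(cut_true, cut_true + s)) for s in shifts]

if __name__ == "__main__":
    for shift, err in sweep():
        print(f"shift={shift:.2f}  relative L2 error={err:.4f}")
```

In this toy setting the squared error grows linearly with the width of the misclassified band, which is why even small displacements matter when the jump is large. A real version of the test would compare this curve against the error of a baseline operator trained on the same data.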
Original abstract
Neural operators have achieved strong performance in learning solution operators of partial differential equations (PDEs), but their inherently continuous representations struggle to capture discontinuities and sharp transitions. Existing approaches typically approximate such features within continuous function spaces, often requiring increased model capacity and high-resolution data. In this work, we propose Cut-DeepONet, a two-stage training framework that explicitly models discontinuities while reducing learning complexity. Our approach reformulates the problem via a lifting strategy, partitioning the domain into smooth subregions while representing discontinuities as boundaries in a higher-dimensional space. This separation aligns the operator learning task with the inductive bias of neural networks and avoids directly approximating discontinuities. An additional network predicts input-dependent discontinuity locations for unseen inputs, which are then used to guide the neural operator in generating smooth components within each region. Experiments on benchmark PDEs show that Cut-DeepONet outperforms state-of-the-art methods, even when trained on low-resolution datasets. The method excels on problems with discontinuities and sharp transitions, while using fewer trainable parameters. Our results highlight the benefits of changing the representation of operator learning rather than increasing model complexity.
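The inference path described in the abstract (predict the cuts first, then evaluate a smooth component per region) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `predict_cuts`, `smooth_component`, `region_index` and `evaluate` are hypothetical names, the domain is assumed to be the interval [0, 1], and both "networks" are replaced by fixed closed-form rules so the example runs without any training.

```python
import bisect
import math
from typing import List, Sequence

def predict_cuts(a: Sequence[float]) -> List[float]:
    """Stage 1 stand-in for the auxiliary network (hypothetical rule):
    place a single cut at the mean of the input samples, clipped to [0, 1]."""
    c = sum(a) / len(a)
    return [min(max(c, 0.0), 1.0)]

def smooth_component(k: int, a: Sequence[float], x: float) -> float:
    """Stage 2 stand-in for the neural operator (hypothetical rule):
    a smooth function of x for region k, scaled by the input."""
    amplitude = max(a)
    return amplitude * math.sin(x) if k == 0 else -amplitude * math.cos(x)

def region_index(x: float, cuts: Sequence[float]) -> int:
    """Index of the subregion containing x, given sorted cut locations.
    A point exactly on a cut is assigned to the region on its right."""
    return bisect.bisect_right(cuts, x)

def evaluate(a: Sequence[float], xs: Sequence[float]) -> List[float]:
    """Assemble the piecewise output: pick the region, then its smooth component."""
    cuts = sorted(predict_cuts(a))
    return [smooth_component(region_index(x, cuts), a, x) for x in xs]

if __name__ == "__main__":
    a = [0.2, 0.4, 0.6]  # toy input samples; the cut lands near x = 0.4
    print(evaluate(a, [0.39, 0.41]))  # values on either side of the cut
```

The design point the sketch tries to show is that the jump in the output comes only from switching regions at the predicted cut; neither stand-in function has to represent a discontinuity itself.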
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents Cut-DeepONet, a two-stage training framework for neural operators that explicitly models discontinuities in PDE solutions via a lifting strategy. The domain is partitioned into smooth subregions, with discontinuities represented as boundaries in a higher-dimensional space. An auxiliary network predicts input-dependent discontinuity locations for unseen inputs, and the main operator then learns the smooth components within each region. Experiments on benchmark PDEs are reported to show outperformance over state-of-the-art methods, even on low-resolution training data, while using fewer trainable parameters.
Significance. If the empirical results hold under scrutiny, the work could be significant for neural operator research by demonstrating that representational changes (explicit partitioning and lifting) can address limitations with discontinuous functions more effectively than increasing model capacity or data resolution. This aligns operator learning better with the inductive biases of neural networks and may extend to other scientific ML settings involving interfaces or shocks.
major comments (3)
- [§3.2] §3.2 (Auxiliary Network and Lifting): The central claim that discontinuities are avoided for unseen inputs rests on the auxiliary network accurately predicting cut locations. No quantitative bounds on prediction error, sensitivity analysis, or ablation showing how main-operator error scales with location error (e.g., >1-2% domain length) are provided; systematic misalignment would reintroduce jumps at the predicted boundaries and undermine the partitioning benefit.
- [§4] §4 (Experiments): The reported outperformance on low-resolution datasets and with fewer parameters lacks error bars, details on data splits, number of runs, or statistical significance tests. This weakens the ability to verify the strongest claim that the method reliably excels on problems with discontinuities and sharp transitions.
- [§2.2] §2.2 (Lifting Strategy, Eq. (5) or equivalent): The mathematical description of how the lifted partitioning preserves all necessary information and maps outputs back to the original domain without introducing new boundary artifacts is insufficiently precise, making it hard to assess whether the separation truly aligns with neural network biases without loss.
minor comments (2)
- [Abstract] The abstract would be strengthened by naming at least one benchmark PDE and including a concrete quantitative improvement (e.g., relative L2 error reduction) rather than the generic statement of outperformance.
- [§2] Notation for the lifted space and subregion combination could be made more explicit to avoid potential confusion with standard DeepONet branch/trunk formulations.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below. Where appropriate, we have revised the manuscript to address the concerns raised.
- Referee: [§3.2] §3.2 (Auxiliary Network and Lifting): The central claim that discontinuities are avoided for unseen inputs rests on the auxiliary network accurately predicting cut locations. No quantitative bounds on prediction error, sensitivity analysis, or ablation showing how main-operator error scales with location error (e.g., >1-2% domain length) are provided; systematic misalignment would reintroduce jumps at the predicted boundaries and undermine the partitioning benefit.
  Authors: We appreciate this observation. While we do not provide theoretical bounds on the prediction error of the auxiliary network, our experiments demonstrate that the auxiliary network achieves high accuracy in predicting discontinuity locations across the tested benchmarks. To further address this, we will include a sensitivity analysis in the revised manuscript, showing the impact of controlled perturbations in the predicted cut locations on the main operator's performance. This will quantify how errors in location prediction affect the overall error, particularly for misalignments exceeding 1-2% of the domain length.
  Revision: yes
- Referee: [§4] §4 (Experiments): The reported outperformance on low-resolution datasets and with fewer parameters lacks error bars, details on data splits, number of runs, or statistical significance tests. This weakens the ability to verify the strongest claim that the method reliably excels on problems with discontinuities and sharp transitions.
  Authors: We agree that including error bars, details on data splits, number of runs, and statistical significance would strengthen the experimental section. In the revised version, we will report results averaged over multiple independent runs with different random seeds, include standard deviations as error bars, specify the train/validation/test splits used, and perform statistical significance tests (e.g., paired t-tests) to compare against baseline methods. This will provide a more rigorous validation of the outperformance claims.
  Revision: yes
- Referee: [§2.2] §2.2 (Lifting Strategy, Eq. (5) or equivalent): The mathematical description of how the lifted partitioning preserves all necessary information and maps outputs back to the original domain without introducing new boundary artifacts is insufficiently precise, making it hard to assess whether the separation truly aligns with neural network biases without loss.
  Authors: We thank the referee for pointing this out. The current description in §2.2 aims to convey the core idea of the lifting strategy, but we acknowledge it could be more precise. In the revision, we will expand the mathematical formulation to explicitly detail the lifting operator, how the partitioning into subregions is represented in the higher-dimensional space, the preservation of information, and the exact mapping procedure back to the original domain. We will also clarify why this does not introduce additional boundary artifacts, ensuring the separation aligns with the inductive biases of neural networks.
  Revision: yes
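The reporting protocol promised in the response on §4 (mean and standard deviation over seeds, plus a paired t-test against a baseline) can be sketched with the standard library alone. Everything below is illustrative: the function names are hypothetical, and the error values are placeholders invented for the example, not results from the paper.

```python
import math
import statistics
from typing import Sequence, Tuple

def mean_and_std(errors: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-seed errors (the error bar)."""
    return statistics.mean(errors), statistics.stdev(errors)

def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t statistic for paired samples: mean(a - b) / (stdev(a - b) / sqrt(n)).
    Pairs must come from the same seed and data split."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        raise ValueError("differences have zero variance; t statistic undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))

if __name__ == "__main__":
    # PLACEHOLDER per-seed test errors, made up for illustration only.
    placeholder_baseline = [0.10, 0.12, 0.11, 0.13]
    placeholder_proposed = [0.08, 0.09, 0.10, 0.09]
    for name, errs in [("baseline", placeholder_baseline),
                       ("proposed", placeholder_proposed)]:
        m, s = mean_and_std(errs)
        print(f"{name}: {m:.3f} +/- {s:.3f} over {len(errs)} seeds")
    t = paired_t_statistic(placeholder_baseline, placeholder_proposed)
    print(f"paired t statistic (df={len(placeholder_baseline) - 1}): {t:.3f}")
```

Turning the statistic into a p-value requires the Student t distribution with n - 1 degrees of freedom; a library routine such as `scipy.stats.ttest_rel` does both steps at once if SciPy is available.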
Circularity Check
No circularity: explicit auxiliary network and lifting strategy are independent modeling choices
The paper presents Cut-DeepONet as a two-stage framework that trains an auxiliary network to predict input-dependent discontinuity locations and then applies a lifting/partitioning step to reformulate the operator learning task over smooth subregions. These are architectural decisions whose correctness is evaluated empirically on benchmark PDEs rather than derived from a closed chain of equations that reduce to fitted quantities or self-citations by construction. No self-definitional relations, predictions that are statistically forced by the same fit, or load-bearing uniqueness theorems imported from prior author work appear in the provided description. The central performance claims rest on experimental comparisons, which remain falsifiable outside any internal definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Neural networks possess an inductive bias favoring smooth continuous functions.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "we reformulate the problem with discontinuities by lifting it into a higher-dimensional space, in which the domain is decomposed into multiple subdomains such that discontinuities occur only at their boundaries"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean: absolute_floor_iff_bare_distinguishability
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "An additional network predicts input-dependent discontinuity locations for unseen inputs"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.