
arXiv: 2605.19823 · v1 · pith:E3E7GW6Bnew · submitted 2026-05-19 · 💻 cs.LG · cs.AI · math.AP · math.DS · stat.ML

Smooth Piecewise Cutting for Neural Operator to Handle Discontinuities and Sharp Transitions

Pith reviewed 2026-05-20 07:02 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · math.AP · math.DS · stat.ML
keywords neural operators · discontinuities · PDEs · piecewise cutting · domain lifting · operator learning · sharp transitions · low-resolution training

The pith

Cut-DeepONet partitions PDE domains into smooth subregions to learn operators around discontinuities without approximating them directly.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Cut-DeepONet as a two-stage framework that first predicts discontinuity locations with an auxiliary network and then applies a lifting strategy to divide the domain into smooth pieces. This reformulation lets the main neural operator focus on continuous functions within each subregion instead of struggling to capture jumps in a single continuous representation. The approach is tested on benchmark PDEs that feature discontinuities and sharp transitions. It delivers higher accuracy than prior neural operator methods while using fewer parameters and succeeding even with low-resolution training data.

Core claim

Reformulating the operator learning problem via a lifting strategy partitions the domain into smooth subregions, with discontinuities represented as boundaries in a higher-dimensional space, so that an auxiliary network can predict input-dependent cut locations for unseen inputs and the neural operator can generate accurate smooth components in each region without directly approximating the discontinuities.

What carries the argument

The lifting strategy that partitions the domain into smooth subregions by representing discontinuities as boundaries in a higher-dimensional space.
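
One way to picture the lifting idea is the minimal Python sketch below. It is not the paper's implementation: the cut location `cut`, the region index `z`, the jump height 0.5, and all function names are hypothetical, chosen only to show how a profile that jumps in x can be written as a smooth function of a lifted pair (x, z).

```python
import math

def region_index(x, cut):
    """Hypothetical lifting coordinate: 0.0 left of the cut, 1.0 at or right of it."""
    return 0.0 if x < cut else 1.0

def smooth_lifted(x, z):
    """Smooth in both x and z; a stand-in for what the operator would learn."""
    return math.sin(x) + 0.5 * z

def evaluate(x, cut):
    """Compose the lift with the smooth function to get a discontinuous profile in x."""
    return smooth_lifted(x, region_index(x, cut))

cut = 1.0   # assumed cut location, for illustration only
eps = 1e-9

# The composed profile jumps across the cut...
jump = evaluate(cut, cut) - evaluate(cut - eps, cut)

# ...while the lifted function, held at a fixed region index, does not.
lifted_gap = smooth_lifted(cut, 1.0) - smooth_lifted(cut - eps, 1.0)

print(f"jump in x: {jump:.6f}, gap at fixed z: {lifted_gap:.2e}")
```

In this toy setting the jump lives entirely in `region_index`, which plays the role the page assigns to the auxiliary network, while `smooth_lifted` contains no discontinuity to approximate.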

If this is right

  • The method outperforms state-of-the-art neural operators on benchmark PDEs that contain discontinuities and sharp transitions.
  • Performance remains strong even when the model is trained only on low-resolution datasets.
  • Fewer trainable parameters are required compared with approaches that increase model capacity to approximate discontinuities inside continuous function spaces.
  • Separating discontinuity modeling from smooth solution learning reduces overall training complexity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same lifting-plus-partitioning idea could be tested on operator learning tasks outside PDEs, such as learning mappings between functions that contain jumps or interfaces.
  • Extending the auxiliary predictor to output uncertainty estimates on cut locations might improve robustness when discontinuities move with the input.
  • The two-stage structure suggests a template for other neural architectures that currently force non-smooth behavior into smooth layers.

Load-bearing premise

An auxiliary network can accurately predict input-dependent discontinuity locations for unseen inputs, and the domain partitioning preserves all necessary information without new errors at the boundaries.

What would settle it

A set of test inputs where the auxiliary network's predicted discontinuity locations differ substantially from the true locations and the resulting solution error exceeds that of a standard neural operator trained on the same data.

Figures

Figures reproduced from arXiv: 2605.19823 by Ha Dang, Juergen Hesser, Sebastian Schmidt.

Figure 1: Overview of the Cut-DeepONet architecture and training procedure. (A) From the original dataset, the discontinuity locations are extracted from G(u)(y) and used to construct both a Lifted Dataset and a Discontinuous Dataset. Stage 1: The Cutting Net is trained on the Discontinuous Dataset to predict the locations of discontinuities. Stage 2: The Lifted Dataset is used to guide the operator learning model t…

Figure 2: Cut-DeepONet outperforms other methods on the linear advection equation and maintains …

Figure 3: Benchmark methods are introduced to handle problems with discontinuities. HyperDeep…

Figure 4: Cut-DeepONet generates discontinuous solutions, which have higher …

Figure 5: Extraction of discontinuities and sharp transitions from data under different data availability …

Figure 6: One example compares different methods on the Inviscid Burger’s Equation …

Figure 7: One example compares different methods on the parsimonious model. Regions near sharp …

Figure 8: Low-resolution Godunov data are used for operator learning in Stage 2, while the Cutting …

Original abstract

Neural operators have achieved strong performance in learning solution operators of partial differential equations (PDEs), but their inherently continuous representations struggle to capture discontinuities and sharp transitions. Existing approaches typically approximate such features within continuous function spaces, often requiring increased model capacity and high-resolution data. In this work, we propose Cut-DeepONet, a two-stage training framework that explicitly models discontinuities while reducing learning complexity. Our approach reformulates the problem via a lifting strategy, partitioning the domain into smooth subregions while representing discontinuities as boundaries in a higher-dimensional space. This separation aligns the operator learning task with the inductive bias of neural networks and avoids directly approximating discontinuities. An additional network predicts input-dependent discontinuity locations for unseen inputs, which are then used to guide the neural operator in generating smooth components within each region. Experiments on benchmark PDEs show that Cut-DeepONet outperforms state-of-the-art methods, even when trained on low-resolution datasets. The method excels on problems with discontinuities and sharp transitions, while using fewer trainable parameters. Our results highlight the benefits of changing the representation of operator learning rather than increasing model complexity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents Cut-DeepONet, a two-stage training framework for neural operators that explicitly models discontinuities in PDE solutions via a lifting strategy. The domain is partitioned into smooth subregions, with discontinuities represented as boundaries in a higher-dimensional space. An auxiliary network predicts input-dependent discontinuity locations for unseen inputs, and the main operator then learns the smooth components within each region. The reported experiments on benchmark PDEs show the method outperforming state-of-the-art baselines, even with low-resolution training data, while using fewer trainable parameters.

Significance. If the empirical results hold under scrutiny, the work could be significant for neural operator research by demonstrating that representational changes (explicit partitioning and lifting) can address limitations with discontinuous functions more effectively than increasing model capacity or data resolution. This aligns operator learning better with the inductive biases of neural networks and may extend to other scientific ML settings involving interfaces or shocks.

major comments (3)
  1. [§3.2] §3.2 (Auxiliary Network and Lifting): The central claim that discontinuities are avoided for unseen inputs rests on the auxiliary network accurately predicting cut locations. No quantitative bounds on prediction error, sensitivity analysis, or ablation showing how main-operator error scales with location error (e.g., >1-2% domain length) are provided; systematic misalignment would reintroduce jumps at the predicted boundaries and undermine the partitioning benefit.
  2. [§4] §4 (Experiments): The reported outperformance on low-resolution datasets and with fewer parameters lacks error bars, details on data splits, number of runs, or statistical significance tests. This weakens the ability to verify the strongest claim that the method reliably excels on problems with discontinuities and sharp transitions.
  3. [§2.2] §2.2 (Lifting Strategy, Eq. (5) or equivalent): The mathematical description of how the lifted partitioning preserves all necessary information and maps outputs back to the original domain without introducing new boundary artifacts is insufficiently precise, making it hard to assess whether the separation truly aligns with neural network biases without loss.
minor comments (2)
  1. [Abstract] The abstract would be strengthened by naming at least one benchmark PDE and including a concrete quantitative improvement (e.g., relative L2 error reduction) rather than the generic statement of outperformance.
  2. [§2] Notation for the lifted space and subregion combination could be made more explicit to avoid potential confusion with standard DeepONet branch/trunk formulations.
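
The reporting requested in major comment 2 (error bars over several runs and a paired comparison against a baseline) could be computed along the lines of the sketch below. This is an illustration using only the Python standard library. The per-seed errors are placeholder numbers invented to exercise the functions, not results from the paper, and the names `method_a` and `method_b` are hypothetical.

```python
import math
import statistics

def summarize(errors):
    """Return (mean, sample standard deviation) of per-run errors."""
    return statistics.mean(errors), statistics.stdev(errors)

def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for runs that share seeds and data splits.

    The statistic has len(errors_a) - 1 degrees of freedom; turning it
    into a p-value needs a t-distribution table or a statistics library.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# Placeholder relative L2 errors for three seeds (NOT from the paper).
method_a = [0.10, 0.12, 0.11]
method_b = [0.20, 0.21, 0.23]

mean_a, std_a = summarize(method_a)
mean_b, std_b = summarize(method_b)
t_stat = paired_t_statistic(method_a, method_b)

print(f"A: {mean_a:.3f} +/- {std_a:.3f}")
print(f"B: {mean_b:.3f} +/- {std_b:.3f}")
print(f"paired t statistic: {t_stat:.2f}")
```

A paired test suits this setting because both methods would be evaluated on the same seeds and splits, so the per-seed differences are the quantity of interest. If SciPy is available, `scipy.stats.ttest_rel` returns the same statistic together with a p-value.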

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below. Where appropriate, we have revised the manuscript to address the concerns raised.

  1. Referee: [§3.2] §3.2 (Auxiliary Network and Lifting): The central claim that discontinuities are avoided for unseen inputs rests on the auxiliary network accurately predicting cut locations. No quantitative bounds on prediction error, sensitivity analysis, or ablation showing how main-operator error scales with location error (e.g., >1-2% domain length) are provided; systematic misalignment would reintroduce jumps at the predicted boundaries and undermine the partitioning benefit.

    Authors: We appreciate this observation. While we do not provide theoretical bounds on the prediction error of the auxiliary network, our experiments demonstrate that the auxiliary network achieves high accuracy in predicting discontinuity locations across the tested benchmarks. To further address this, we will include a sensitivity analysis in the revised manuscript, showing the impact of controlled perturbations in the predicted cut locations on the main operator's performance. This will quantify how errors in location prediction affect the overall error, particularly for misalignments exceeding 1-2% of the domain length. revision: yes

  2. Referee: [§4] §4 (Experiments): The reported outperformance on low-resolution datasets and with fewer parameters lacks error bars, details on data splits, number of runs, or statistical significance tests. This weakens the ability to verify the strongest claim that the method reliably excels on problems with discontinuities and sharp transitions.

    Authors: We agree that including error bars, details on data splits, number of runs, and statistical significance would strengthen the experimental section. In the revised version, we will report results averaged over multiple independent runs with different random seeds, include standard deviations as error bars, specify the train/validation/test splits used, and perform statistical significance tests (e.g., paired t-tests) to compare against baseline methods. This will provide a more rigorous validation of the outperformance claims. revision: yes

  3. Referee: [§2.2] §2.2 (Lifting Strategy, Eq. (5) or equivalent): The mathematical description of how the lifted partitioning preserves all necessary information and maps outputs back to the original domain without introducing new boundary artifacts is insufficiently precise, making it hard to assess whether the separation truly aligns with neural network biases without loss.

    Authors: We thank the referee for pointing this out. The current description in §2.2 aims to convey the core idea of the lifting strategy, but we acknowledge it could be more precise. In the revision, we will expand the mathematical formulation to explicitly detail the lifting operator, how the partitioning into subregions is represented in the higher-dimensional space, the preservation of information, and the exact mapping procedure back to the original domain. We will also clarify why this does not introduce additional boundary artifacts, ensuring the separation aligns with the inductive biases of neural networks. revision: yes
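
The sensitivity analysis promised in response 1 can be pictured with a toy experiment. The sketch below is an assumption-laden illustration, not the authors' protocol: the target profile, the unit jump height, the grid size, and the function names are all hypothetical. It takes a profile whose smooth parts are reproduced exactly, shifts only the cut location, and measures the resulting relative L2 error.

```python
import math

def piecewise(x, cut):
    """Toy target: smooth background plus a unit jump at `cut` (illustrative only)."""
    return math.sin(2 * math.pi * x) + (1.0 if x >= cut else 0.0)

def relative_l2(pred, true):
    """Relative L2 error between two sampled profiles."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t ** 2 for t in true))
    return num / den

def cut_shift_error(true_cut, shift, n=2001):
    """Error caused purely by misplacing the cut by `shift` on a uniform grid over [0, 1]."""
    xs = [i / (n - 1) for i in range(n)]
    true = [piecewise(x, true_cut) for x in xs]
    pred = [piecewise(x, true_cut + shift) for x in xs]
    return relative_l2(pred, true)

# Shifts expressed as fractions of the domain length (0%, 1%, 2%, 4%).
errors = {s: cut_shift_error(0.5, s) for s in (0.0, 0.01, 0.02, 0.04)}

for shift, err in errors.items():
    print(f"shift = {shift:.2f}, relative L2 error = {err:.4f}")
```

For a jump of fixed height, the squared L2 error is proportional to the width of the misaligned strip, so in this toy setup the error grows roughly with the square root of the shift. A real study would replace the exact smooth pieces with the trained operator's outputs.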

Circularity Check

0 steps flagged

No circularity: explicit auxiliary network and lifting strategy are independent modeling choices


The paper presents Cut-DeepONet as a two-stage framework that trains an auxiliary network to predict input-dependent discontinuity locations and then applies a lifting/partitioning step to reformulate the operator learning task over smooth subregions. These are architectural decisions whose correctness is evaluated empirically on benchmark PDEs rather than derived from a closed chain of equations that reduce to fitted quantities or self-citations by construction. No self-definitional relations, predictions that are statistically forced by the same fit, or load-bearing uniqueness theorems imported from prior author work appear in the provided description. The central performance claims rest on experimental comparisons, which remain falsifiable outside any internal definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that neural networks are biased toward smooth functions and that discontinuities can be isolated as boundaries without loss of fidelity; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption: Neural networks possess an inductive bias favoring smooth continuous functions.
    Invoked to justify why separating smooth subregions reduces learning complexity.



