Physics-Aligned Canonical Equivariant Fourier Neural Operator under Symmetry-Induced Shifts
Pith reviewed 2026-05-20 12:25 UTC · model grok-4.3
The pith
A neural operator estimates the symmetry frame of a PDE input, maps it to a canonical reference, runs standard Fourier layers, and maps the output back to improve generalization under shifts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PACE-FNO estimates the input frame with a Lie-algebra coordinate estimator, maps the field to a reference frame, applies a standard Fourier Neural Operator, and restores the prediction to the target frame. Equivariance is enforced solely by the input and output transformations while the FNO architecture remains unchanged. The model is trained jointly on alignment and operator prediction using bounded symmetry perturbations, with an optional low-dimensional refinement step that updates the estimated frame at inference. On 1-D and 2-D Burgers, shallow-water, and Navier-Stokes equations, this produces in-distribution accuracy comparable to standard neural operators and reduces out-of-distribution…
What carries the argument
Physics-Aligned Canonical Equivariant Fourier Neural Operator (PACE-FNO), which uses a Lie-algebra coordinate estimator to align the input field to a canonical reference frame before standard FNO processing and restores the output frame afterward.
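The align, apply, restore pattern described above can be sketched for the simplest case, translations of a 1-D periodic field. This is a minimal illustration under stated assumptions, not the paper's implementation: `estimate_shift` is a hand-written stand-in for the learned Lie-algebra estimator (it reads the offset from the phase of the first Fourier mode and is ill-defined when that mode vanishes), `toy_operator` is a placeholder for the FNO that is deliberately not translation-equivariant, and all function names are hypothetical.

```python
import numpy as np

LENGTH = 2 * np.pi  # assumed period of the 1-D domain


def shift(u, a, length=LENGTH):
    """Translate a periodic 1-D field: returns samples of x -> u(x - a)."""
    n = u.shape[-1]
    k = 2 * np.pi * np.fft.rfftfreq(n, d=length / n)  # angular wavenumbers
    return np.fft.irfft(np.fft.rfft(u) * np.exp(-1j * k * a), n=n)


def estimate_shift(u, length=LENGTH):
    """Stand-in frame estimator (assumption): offset from the first Fourier mode's phase."""
    return -np.angle(np.fft.rfft(u)[1]) * length / (2 * np.pi)


def canonical_apply(op, u, length=LENGTH):
    """Estimate the frame, map to the reference frame, apply op, restore the frame."""
    a = estimate_shift(u, length)
    u_canon = shift(u, -a, length)   # align input to the canonical frame
    v_canon = op(u_canon)            # unmodified base operator
    return shift(v_canon, a, length)  # restore the output frame


def toy_operator(u):
    """Placeholder for the FNO: position-dependent, so NOT translation-equivariant."""
    x = np.linspace(0.0, LENGTH, u.shape[-1], endpoint=False)
    return (1.0 + 0.5 * np.cos(x)) * u**2


x = np.linspace(0.0, LENGTH, 64, endpoint=False)
u = np.sin(x) + 0.3 * np.cos(2 * x + 0.5)
a = 0.7

# Equivariance gap: op(shift(u)) versus shift(op(u)), without and with the wrapper.
bare_gap = np.max(np.abs(toy_operator(shift(u, a)) - shift(toy_operator(u), a)))
wrapped_gap = np.max(
    np.abs(canonical_apply(toy_operator, shift(u, a))
           - shift(canonical_apply(toy_operator, u), a))
)
print(bare_gap, wrapped_gap)
```

The wrapper makes the composite map commute with translations even though the inner operator does not, which is the sense in which equivariance comes from the input and output transformations alone. The test field is band-limited well below the Nyquist frequency, so the Fourier phase shift acts as an exact translation here.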
If this is right
- Aligning the input and restoring the output frame account for the majority of out-of-distribution error reduction under translations and Galilean shifts.
- Optional inference-time refinement of the estimated frame supplies a smaller additional correction.
- The approach preserves full equivariance without any architectural change to the underlying Fourier Neural Operator.
- Gains are largest for pure translation and Galilean shifts and smaller for coupled rotation-translation shifts.
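The equivariance claim in the list above can be made explicit with a short derivation. It is a sketch under assumptions the summary does not state: $T_g$ denotes the action of a symmetry group element $g$ on fields with $T_{hg} = T_h T_g$, $F$ is the unmodified FNO, and $\hat g(u)$ is the estimated frame, assumed to transform exactly as $\hat g(T_h u) = h\,\hat g(u)$. A learned estimator is likely to satisfy this only approximately, in which case the equivariance of $G$ is approximate to the same degree.

```latex
\begin{align*}
G(u) &:= T_{\hat g(u)}\, F\!\left(T_{\hat g(u)}^{-1}\, u\right), \\
G(T_h u) &= T_{h \hat g(u)}\, F\!\left(T_{h \hat g(u)}^{-1}\, T_h u\right) \\
         &= T_h\, T_{\hat g(u)}\, F\!\left(T_{\hat g(u)}^{-1}\, T_h^{-1}\, T_h u\right) \\
         &= T_h\, G(u).
\end{align*}
```

No property of $F$ is used in the argument, which is consistent with the statement that the Fourier layers need no architectural change.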
Where Pith is reading between the lines
- The same separation of alignment from evolution could be tested on non-periodic domains by replacing the Lie-algebra estimator with a boundary-aware symmetry detector.
- Combining PACE-style alignment with other base operators such as graph or attention layers might extend the OOD improvements to irregular meshes.
- If symmetry estimation proves stable across time steps, the method could support long-horizon rollouts where frame drift would otherwise accumulate.
Load-bearing premise
The governing PDEs must admit continuous symmetries such as translations or Galilean boosts that can be estimated reliably from a single snapshot.
What would settle it
Run PACE-FNO on a PDE whose solutions lack continuous symmetries or on data where single-snapshot symmetry estimation error exceeds the bounded perturbations used in training; if OOD relative error remains comparable to standard FNO with augmentation, the alignment separation provides no gain.
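The comparison proposed above needs a concrete error metric. The sketch below assumes the common per-sample relative L2 error averaged over a batch; the paper's exact definition is not given in this summary, and the function names are hypothetical.

```python
import numpy as np


def relative_l2_error(pred, target):
    """Mean over the batch of ||pred - target||_2 / ||target||_2 (assumed metric).

    Both arrays have shape (batch, ...); norms are taken over all non-batch axes.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    axes = tuple(range(1, pred.ndim))
    num = np.sqrt(np.sum((pred - target) ** 2, axis=axes))
    den = np.sqrt(np.sum(target ** 2, axis=axes))
    return float(np.mean(num / den))


def error_reduction_factor(baseline_error, model_error):
    """Ratio of baseline to model error; a value near 1 means no gain."""
    return baseline_error / model_error
```

Under the test described above, a reduction factor close to 1 between FNO with augmentation and PACE-FNO on the OOD set would indicate that the alignment separation provides no gain.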
Original abstract
Neural operators approximate PDE solution maps, but they need not respect the symmetries of the governing equation. In out-of-distribution (OOD) regimes, a standard neural operator must often learn coordinate alignment and physical evolution within a single map, which can hurt generalization. We use known continuous symmetries of evolution equations on periodic domains to separate these two roles. We propose the Physics-Aligned Canonical Equivariant Fourier Neural Operator (PACE-FNO), which estimates the input frame with a Lie-algebra coordinate estimator, maps the field to a reference frame, applies a standard Fourier Neural Operator (FNO), and restores the prediction to the target frame. We train alignment and operator prediction jointly using bounded symmetry perturbations, with an optional low-dimensional refinement step that updates the estimated frame at inference. Equivariance is enforced by the input and output transformations, while the FNO architecture remains unchanged. Across 1-D and 2-D Burgers, shallow-water, and Navier-Stokes equations on periodic domains, PACE-FNO matches the in-distribution (ID) accuracy of standard neural operators and reduces out-of-distribution (OOD) relative error by up to 12x over FNO with symmetry augmentation (FNO+Aug) under translations and Galilean shifts, with smaller gains for coupled rotation-translation shifts. Ablations show that aligning the input and restoring the output frame account for most OOD gains; inference-time refinement provides a smaller correction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Physics-Aligned Canonical Equivariant Fourier Neural Operator (PACE-FNO). It estimates an input frame via a Lie-algebra coordinate estimator, maps the field to a reference frame, applies an unmodified Fourier Neural Operator, and restores the prediction to the target frame. The approach is trained jointly using bounded symmetry perturbations (with optional inference-time refinement) and is evaluated on 1-D/2-D Burgers, shallow-water, and Navier-Stokes equations on periodic domains. It claims to match in-distribution accuracy of standard neural operators while reducing out-of-distribution relative error by up to 12x versus FNO with symmetry augmentation under translations and Galilean shifts (smaller gains for coupled rotation-translation shifts), with ablations attributing most OOD gains to the alignment and restoration steps.
Significance. If the central claims hold, the work demonstrates a practical way to exploit known continuous symmetries to separate coordinate alignment from physical evolution inside neural operators, improving OOD generalization without architectural changes to the FNO core. The joint training protocol, the optional refinement step, and the ablations that isolate the contribution of alignment/restoration are concrete strengths that support the reported error reductions.
major comments (2)
- [Training protocol and OOD evaluation (abstract and §4)] Training uses only bounded random symmetry perturbations, yet OOD tests apply larger translations, Galilean boosts, and coupled rotations that exceed those bounds. If the Lie-algebra estimator error grows with shift size, the FNO receives inputs that deviate from the true canonical frame, so the up-to-12x OOD error reduction cannot be attributed solely to the separation of alignment and evolution. Report frame-estimation error on the OOD test sets or provide a robustness analysis for shifts beyond the training perturbation magnitude.
- [Ablation studies] The ablation results isolate the alignment/restoration steps as the source of most OOD gains, but these ablations are performed within the same bounded-perturbation regime used for training. They do not directly verify estimator accuracy under the larger OOD shifts of the main experiments, leaving the load-bearing attribution of the 12x improvement partially untested.
minor comments (2)
- [Experimental setup] Exact training hyperparameters, optimizer settings, and the precise architecture of the low-dimensional refinement step are not fully detailed, limiting reproducibility.
- [Method description] Clarify the mathematical definition of the Lie-algebra coordinate estimator (including how the frame is represented and restored) and any assumptions on the domain periodicity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of our training protocol and evaluation strategy. We address each major comment below and will incorporate revisions to strengthen the paper.
- Referee: [Training protocol and OOD evaluation (abstract and §4)] Training uses only bounded random symmetry perturbations, yet OOD tests apply larger translations, Galilean boosts, and coupled rotations that exceed those bounds. If the Lie-algebra estimator error grows with shift size, the FNO receives inputs that deviate from the true canonical frame, so the up-to-12x OOD error reduction cannot be attributed solely to the separation of alignment and evolution. Report frame-estimation error on the OOD test sets or provide a robustness analysis for shifts beyond the training perturbation magnitude.
  Authors: We agree that explicitly reporting frame-estimation accuracy on the OOD test sets would strengthen the attribution of gains to the alignment mechanism. The joint training protocol is designed such that the estimator learns to predict Lie-algebra coordinates under bounded perturbations while the overall system (alignment + FNO + restoration) generalizes to larger shifts, as evidenced by the reported OOD improvements. The optional inference-time refinement step further mitigates potential estimator drift for larger shifts. In the revised manuscript, we will add a new subsection with tables reporting the frame-estimation error (e.g., mean L2 distance between estimated and ground-truth frame parameters) on all OOD test sets for translations, Galilean shifts, and coupled rotation-translation cases. This will directly address whether estimator error remains controlled beyond training bounds and clarify the contribution of alignment versus refinement.
  revision: yes
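The frame-estimation metric mentioned in the response (mean L2 distance between estimated and ground-truth frame parameters) could be computed along the following lines. This is a hedged sketch, not the authors' evaluation code: the function names are hypothetical, and the wrap-around handling is an assumption made here because translation offsets on a periodic domain are only defined modulo the period.

```python
import numpy as np


def wrapped_difference(est, true, period):
    """Signed difference est - true, wrapped into [-period/2, period/2)."""
    return (est - true + period / 2) % period - period / 2


def mean_frame_error(est, true, period=None):
    """Mean L2 distance between estimated and true frame parameters.

    est, true: arrays of shape (num_samples, num_params).
    period: scalar period for periodic parameters (e.g. translation offsets),
            or None for non-periodic parameters (e.g. Galilean boost velocities).
    """
    est = np.asarray(est, dtype=float)
    true = np.asarray(true, dtype=float)
    if period is None:
        diff = est - true
    else:
        diff = wrapped_difference(est, true, period)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))
```

For example, with a period of 2π an estimate of 6.2 against a true offset of 0.0 counts as an error of about 0.083 rather than 6.2, since the two offsets describe nearly the same translation.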
- Referee: [Ablation studies] The ablation results isolate the alignment/restoration steps as the source of most OOD gains, but these ablations are performed within the same bounded-perturbation regime used for training. They do not directly verify estimator accuracy under the larger OOD shifts of the main experiments, leaving the load-bearing attribution of the 12x improvement partially untested.
  Authors: The ablations were performed in the bounded regime to provide a controlled isolation of the alignment and restoration contributions under the exact training distribution. However, we acknowledge that extending this verification to the larger OOD shifts would more rigorously support the attribution in the main results. We will revise the ablation section to include frame-estimation error metrics computed on the OOD test sets (using the same ablation variants), allowing direct comparison of estimator performance under larger shifts. This addition will confirm that the OOD gains remain attributable to the alignment/restoration steps even when shifts exceed training bounds.
  revision: yes
Circularity Check
No significant circularity in the claimed separation of alignment and evolution
The paper separates symmetry alignment (via a jointly trained Lie-algebra coordinate estimator that produces an explicit frame transformation) from the unchanged FNO evolution operator. Training uses bounded random symmetry perturbations, but OOD evaluation applies held-out larger translations, Galilean shifts, and rotations that were never seen during fitting. The reported error reductions are therefore measured on genuinely unseen inputs rather than quantities that reduce to the training fit by construction. No self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation; the central claim remains an architectural and empirical separation that is independently testable against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- bound on symmetry perturbation magnitude
axioms (1)
- domain assumption: The PDE admits continuous symmetries (translations, Galilean boosts) that leave the equation invariant on periodic domains.
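As a concrete instance of this axiom, the 1-D viscous Burgers equation is invariant under translations and Galilean boosts. The check below is the standard textbook calculation; the symbols $\nu$ (viscosity), $a$ (translation offset), and $c$ (boost velocity) are notation chosen here rather than taken from the paper, and all derivatives of $u$ on the right-hand sides are evaluated at $(x - a - ct,\, t)$.

```latex
\begin{align*}
&u_t + u\,u_x = \nu\,u_{xx}, \qquad \tilde u(x,t) := u(x - a - ct,\; t) + c, \\
&\tilde u_t = u_t - c\,u_x, \qquad \tilde u_x = u_x, \qquad \tilde u_{xx} = u_{xx}, \\
&\tilde u_t + \tilde u\,\tilde u_x
   = u_t - c\,u_x + (u + c)\,u_x
   = u_t + u\,u_x
   = \nu\,u_{xx}
   = \nu\,\tilde u_{xx}.
\end{align*}
```

On a periodic domain the shifted argument is taken modulo the period, so $\tilde u$ remains periodic and the symmetry maps solutions to solutions.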