Adaptive Canonicalization with Application to Invariant Anisotropic Geometric Networks
Pith reviewed 2026-05-18 12:41 UTC · model grok-4.3
The pith
Selecting the canonical form that maximizes network predictive confidence produces continuous and symmetry-respecting models with universal approximation properties.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Adaptive canonicalization based on prior maximization selects the canonical form of the input to maximize the predictive confidence of the network. We prove that this construction yields continuous and symmetry-respecting models that admit universal approximation properties. We propose two applications of our setting: resolving eigenbasis ambiguities in spectral graph neural networks, and handling rotational symmetries in point clouds. We empirically validate our methods on molecular and protein classification, as well as point cloud classification tasks. Our adaptive canonicalization outperforms the three other common solutions to equivariant machine learning: data augmentation, standard canonicalization, and equivariant architectures.
What carries the argument
Adaptive canonicalization based on prior maximization: the mechanism chooses the input's canonical form to maximize the network's predictive confidence, which is claimed to enforce continuity and exact symmetry respect.
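The selection step can be sketched for a finite group. The example below is a minimal illustration, not the paper's implementation: the rotation group `GROUP`, the toy `network`, and the choice of max softmax probability as `confidence` are all assumptions made here for demonstration.

```python
import numpy as np

def rot(theta):
    """2D rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Assumed finite symmetry group: rotations by multiples of 90 degrees.
GROUP = [rot(k * np.pi / 2) for k in range(4)]

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(2, 16)), rng.normal(size=(16, 3))

def network(points):
    """Toy classifier that is NOT rotation invariant on its own."""
    return (np.tanh(points @ W1) @ W2).mean(axis=0)

def confidence(logits):
    """Assumed prior: maximum softmax probability."""
    z = np.exp(logits - logits.max())
    return (z / z.sum()).max()

def adaptive_canonicalize(points):
    """Return the transformed copy with the highest confidence, and that score."""
    candidates = [points @ g.T for g in GROUP]
    scores = [confidence(network(c)) for c in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

cloud = rng.normal(size=(32, 2))
_, score = adaptive_canonicalize(cloud)
_, score_rotated = adaptive_canonicalize(cloud @ GROUP[1].T)
```

Because a rotated input has the same orbit under `GROUP`, the maximized score agrees for `cloud` and its rotated copy up to floating-point error, even though `network` itself is not invariant.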
If this is right
- The resulting models are continuous functions of the input while exactly respecting the symmetries of the data.
- These models can universally approximate any continuous function that is invariant under the given symmetry group.
- Eigenbasis ambiguities in spectral graph neural networks are resolved without introducing discontinuities in the mapping.
- Rotational symmetries in point cloud data are handled by selecting the orientation that maximizes network confidence.
- The approach achieves higher classification accuracy than data augmentation, standard canonicalization, or equivariant architectures on geometric datasets.
Where Pith is reading between the lines
- The adaptive selection could implicitly favor stable training trajectories by aligning canonical choices with regions of high network confidence.
- The same prior-maximization idea could be tested on other symmetry groups such as reflections or discrete permutations beyond graphs.
- Observing which canonical forms are chosen most often on a dataset might reveal how the network internally resolves geometric ambiguities.
- The framework might improve generalization when training data is limited, because the symmetry-respecting property is enforced exactly rather than approximately.
Load-bearing premise
Maximizing the network's predictive confidence over possible canonical forms always produces a selection that is continuous in the input and exactly respects symmetries, without needing extra restrictions on the network or loss landscape.
What would settle it
A concrete input graph or point cloud together with a small perturbation where the maximizing canonical form switches abruptly, causing the overall model output to become discontinuous or to violate the input symmetry.
Original abstract
Canonicalization is a widely used strategy in equivariant machine learning, enforcing symmetry in neural networks by mapping each input to a standard form. Yet, it often introduces discontinuities that can affect stability during training, limit generalization, and complicate universal approximation theorems. In this paper, we address this by introducing adaptive canonicalization, a general framework in which the canonicalization depends both on the input and the network. Specifically, we present the adaptive canonicalization based on prior maximization, where the standard form of the input is chosen to maximize the predictive confidence of the network. We prove that this construction yields continuous and symmetry-respecting models that admit universal approximation properties. We propose two applications of our setting: (i) resolving eigenbasis ambiguities in spectral graph neural networks, and (ii) handling rotational symmetries in point clouds. We empirically validate our methods on molecular and protein classification, as well as point cloud classification tasks. Our adaptive canonicalization outperforms the three other common solutions to equivariant machine learning: data augmentation, standard canonicalization, and equivariant architectures.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces adaptive canonicalization based on prior maximization, in which the canonical representative of an input is chosen to maximize the network's predictive confidence. This construction is claimed to yield continuous, symmetry-respecting models that admit universal approximation. Two concrete applications are developed: resolving eigenbasis ambiguities in spectral graph neural networks and handling rotational symmetries for point clouds. Empirical results on molecular, protein, and point-cloud classification tasks show outperformance relative to data augmentation, standard canonicalization, and equivariant architectures.
Significance. If the continuity and universal-approximation claims are rigorously established, the framework would provide a practical route to symmetry enforcement that avoids both the discontinuities of fixed canonicalization and the architectural overhead of fully equivariant layers. The empirical gains on standard geometric benchmarks would be of immediate interest to practitioners in molecular modeling and 3D vision.
major comments (2)
- [Abstract and §3] Abstract and §3 (theoretical development): the central claim is that prior-maximization canonicalization produces a continuous map 'without requiring further restrictions on the network or loss landscape.' The argmax operator over a finite orbit is discontinuous wherever two candidates have equal or crossing confidence values. The manuscript must supply the precise lemma or selection rule (unique maximizer, continuous tie-breaking, or smoothing) that guarantees continuity of the resulting canonicalization map; without it the continuity and universal-approximation statements rest on an unstated assumption.
- [§4.1] §4.1 (eigenbasis application): the symmetry-respecting property is asserted after the adaptive choice, yet the proof sketch does not address whether the network's confidence function itself transforms equivariantly under the group action; a counter-example or explicit verification is needed to confirm that the selected eigenbasis is invariant under the original symmetry.
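The first major comment turns on the difference between the selected index and the selected value. The toy sketch below uses two hypothetical confidence curves that cross at `t = 0`. It shows the argmax index switching while the maximum value changes only slightly. It illustrates the general fact that a maximum of finitely many continuous functions is continuous; it does not show how the paper's proof proceeds.

```python
def candidate_confidences(t):
    """Two hypothetical continuous confidence curves that cross at t = 0."""
    return [0.6 + 0.3 * t, 0.6 - 0.3 * t]

ts = [-1e-3, -5e-4, 0.0, 5e-4, 1e-3]

# Index of the maximizing candidate (ties resolve to the first index).
selected = [max(range(2), key=lambda i: candidate_confidences(t)[i]) for t in ts]

# Value of the maximum, which is what a confidence-maximizing model would output.
best_value = [max(candidate_confidences(t)) for t in ts]
jumps = [abs(b - a) for a, b in zip(best_value, best_value[1:])]
```

Here `selected` changes across the crossing, while every entry of `jumps` stays on the order of the step in `t`.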
minor comments (2)
- [Table 1 and §5] Table 1 and §5: report the number of random seeds, standard deviations, and statistical tests for the claimed outperformance; current numbers appear to be single-run point estimates.
- [Notation] Notation: define 'predictive confidence' explicitly (e.g., max softmax probability, margin, or log-likelihood) at first use.
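The three candidate definitions named in the notation comment can be written out as follows. This is a sketch of the standard formulas only; which one the paper uses is not stated in this summary, and the function names are chosen here for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_softmax(logits):
    """Confidence as the largest class probability."""
    return max(softmax(logits))

def margin(logits):
    """Confidence as the gap between the two largest class probabilities."""
    top = sorted(softmax(logits), reverse=True)
    return top[0] - top[1]

def log_likelihood(logits, label):
    """Confidence as the log-probability assigned to a given label."""
    return math.log(softmax(logits)[label])

logits = [2.0, 1.0, 0.0]
```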
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments highlight important points regarding the rigor of our continuity and symmetry claims. We address each major comment below with clarifications and proposed revisions.
-
Referee: [Abstract and §3] Abstract and §3 (theoretical development): the central claim is that prior-maximization canonicalization produces a continuous map 'without requiring further restrictions on the network or loss landscape.' The argmax operator over a finite orbit is discontinuous wherever two candidates have equal or crossing confidence values. The manuscript must supply the precise lemma or selection rule (unique maximizer, continuous tie-breaking, or smoothing) that guarantees continuity of the resulting canonicalization map; without it the continuity and universal-approximation statements rest on an unstated assumption.
Authors: We agree that the argmax over a finite orbit is formally discontinuous at ties. Our §3 proof establishes continuity of the overall map by showing that the confidence function is continuous (as the network is continuous) and that discontinuities occur only on a lower-dimensional subset of the input space where two or more orbit elements achieve identical maximum confidence. To make this fully rigorous, we will add an explicit lemma in the revised §3 that introduces a deterministic, continuous tie-breaking rule: when multiple maximizers exist, select the representative whose canonical coordinates are closest (in Euclidean distance) to a fixed reference vector chosen once per orbit. This rule preserves the symmetry-respecting property and ensures the canonicalization map is continuous everywhere. We will also update the abstract to reference this lemma. Revision will be made. revision: yes
-
Referee: [§4.1] §4.1 (eigenbasis application): the symmetry-respecting property is asserted after the adaptive choice, yet the proof sketch does not address whether the network's confidence function itself transforms equivariantly under the group action; a counter-example or explicit verification is needed to confirm that the selected eigenbasis is invariant under the original symmetry.
Authors: We appreciate this request for explicit verification. In the eigenbasis application, the network (a spectral GNN) is applied after canonicalization, but the confidence score is computed from the network's output logits on the canonicalized graph. Because the underlying graph Laplacian commutes with the symmetry action, any group element g maps the orbit of possible eigenbases to itself. The maximizer of the confidence therefore selects a representative that is equivariant by construction: applying g to the input graph yields a correspondingly transformed maximizer, so the final selected eigenbasis (and thus the network output) remains invariant. We will add a short paragraph with this argument plus a brief counter-example check (a small cycle graph under rotation) to §4.1. This is a partial revision because the core invariance follows from the construction but requires the added verification paragraph. revision: partial
Circularity Check
No significant circularity; derivation relies on independent proof of properties for the defined construction
The paper defines adaptive canonicalization via prior maximization (selecting the input form that maximizes the network's predictive confidence) and states that it proves the resulting models are continuous, symmetry-respecting, and universally approximating. This construction is presented as a general framework with applications to specific symmetries, supported by empirical validation. No quoted step reduces a claimed result to a fitted input, self-citation chain, or definitional tautology by construction. The central claims rest on a mathematical proof rather than renaming or smuggling assumptions, making the derivation self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: There exists a canonical form selection rule based on network confidence that is continuous and symmetry-preserving for the relevant group actions.
Reference graph
Works this paper leans on
-
[1]
Johannes Brandstetter, Rob Hesselink, Elise van der Pol, Erik J Bekkers, and Max Welling. Geometric and physical quantities improve E(3) equivariant message passing. arXiv preprint arXiv:2110.02905,
-
[2]
Xavier Bresson and Thomas Laurent. Residual gated graph ConvNets. arXiv preprint arXiv:1711.07553,
-
[3]
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Michael M Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478,
-
[4]
Taco S Cohen and Max Welling. Steerable CNNs. arXiv preprint arXiv:1612.08498, 2016b. Lynn A Cooper and Roger N Shepard. Chronometric studies of the rotation of mental images. In Visual information processing, pages 75–176. Elsevier,
-
[5]
Metric convolutions: A unifying theory to adaptive convolutions
Thomas Dagès, Michael Lindenbaum, and Alfred M Bruckstein. Metric convolutions: A unifying theory to adaptive convolutions. arXiv preprint arXiv:2406.05400,
-
[6]
Alexandre Duval, Simon V Mathis, Chaitanya K Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D Malliaros, Taco Cohen, Pietro Lio, Yoshua Bengio, and Michael Bronstein. A hitchhiker’s guide to geometric gnns for 3D atomic systems. arXiv preprint arXiv:2312.07511, 2023a. Alexandre Agm Duval, Victor Schmidt, Alex Hernández-García, Santiago Miret, Fragkis...
-
[7]
Fabian Fuchs, Daniel Worrall, Volker Fischer, and Max Welling. SE(3)-transformers: 3D roto-translation equivariant attention networks. Advances in neural information processing systems, 33:1970–1981,
-
[8]
e3nn: Euclidean neural networks
Mario Geiger and Tess Smidt. e3nn: Euclidean neural networks. arXiv preprint arXiv:2207.09453,
-
[9]
Geometrically equivariant graph neural networks: A survey
Jiaqi Han, Yu Rong, Tingyang Xu, and Wenbing Huang. Geometrically equivariant graph neural networks: A survey. arXiv preprint arXiv:2202.07230,
-
[10]
Snir Hordan, Maya Bechler-Speicher, Gur Lifshitz, and Nadav Dym. Spectral graph neural networks are incomplete on graphs with a simple spectrum. arXiv preprint arXiv:2506.05530,
-
[11]
Ian T Jolliffe and Jorge Cadima. Principal component analysis: a review and recent developments. Philosophical transactions of the royal society A: Mathematical, Physical and Engineering Sciences, 374(2065):20150202,
-
[12]
Symmetry breaking and equivariant neural networks
Sékou-Oumar Kaba and Siamak Ravanbakhsh. Symmetry breaking and equivariant neural networks. arXiv preprint arXiv:2312.09016,
-
[13]
Adam: A Method for Stochastic Optimization
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980,
-
[14]
Semi-Supervised Classification with Graph Convolutional Networks
TN Kipf. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907,
-
[15]
Improving equivariant networks with probabilistic symmetry breaking
Hannah Lawrence, Vasco Portilheiro, Yan Zhang, and Sékou-Oumar Kaba. Improving equivariant networks with probabilistic symmetry breaking. arXiv preprint arXiv:2503.21985,
-
[16]
Derek Lim, Joshua Robinson, Lingxiao Zhao, Tess Smidt, Suvrit Sra, Haggai Maron, and Stefanie Jegelka. Sign and basis invariant networks for spectral graph representation learning. arXiv preprint arXiv:2202.13013,
-
[17]
Ya-Wei Eileen Lin, Ronen Talmon, and Ron Levie. Equivariant machine learning on graphs with nonlinear spectral filters. Advances in Neural Information Processing Systems, 37:128182–128226, 2024a. Yuchao Lin, Jacob Helwig, Shurui Gui, and Shuiwang Ji. Equivariance via minimal frame averaging for more symmetries and efficiency. arXiv preprint arXiv:2406.07598...
-
[18]
Sohir Maskey, Ali Parviz, Maximilian Thiessen, Hannes Stärk, Ylli Sadikaj, and Haggai Maron. Generalized Laplacian positional encoding for graph representation learning. arXiv preprint arXiv:2210.15956,
-
[19]
Francesco Mezzadri. How to generate random matrices from the classical compact groups. arXiv preprint math-ph/0609050,
-
[20]
Christopher Morris, Nils M Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. TUDataset: A collection of benchmark datasets for learning with graphs. arXiv preprint arXiv:2007.08663,
-
[21]
Learning symmetric embeddings for equivariant world models
Jung Yeon Park, Ondrej Biza, Linfeng Zhao, Jan Willem van de Meent, and Robin Walters. Learning symmetric embeddings for equivariant world models. arXiv preprint arXiv:2204.11371,
-
[22]
Global attention improves graph networks generalization
Omri Puny, Heli Ben-Hamu, and Yaron Lipman. Global attention improves graph networks generalization. arXiv preprint arXiv:2006.07846,
-
[23]
Frame averaging for invariant and equivariant network design
Omri Puny, Matan Atzmon, Heli Ben-Hamu, Ishan Misra, Aditya Grover, Edward J Smith, and Yaron Lipman. Frame averaging for invariant and equivariant network design. arXiv preprint arXiv:2110.03336,
-
[24]
Symmetry-Aware Generative Modeling through Learned Canonicalization
Kusha Sareen, Daniel Levy, Arnab Kumar Mondal, Sékou-Oumar Kaba, Tara Akhound-Sadegh, and Siamak Ravanbakhsh. Symmetry-aware generative modeling through learned canonicalization. arXiv preprint arXiv:2501.07773,
-
[25]
Robust canonicalization through bootstrapped data re-alignment
Johann Schmidt and Sebastian Stober. Robust canonicalization through bootstrapped data re-alignment. arXiv preprint arXiv:2510.08178,
-
[26]
Erik Henning Thiede, Truong Son Hy, and Risi Kondor. The general theory of permutation equivarant neural networks and higher order graph variational encoders. arXiv preprint arXiv:2004.03990,
-
[27]
Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds
Nathaniel Thomas, Tess Smidt, Steven Kearnes, Lusann Yang, Li Li, Kai Kohlhoff, and Patrick Riley. Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds. arXiv preprint arXiv:1802.08219,
-
[28]
Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903,
-
[29]
Rui Wang, Elyssa Hofgard, Han Gao, Robin Walters, and Tess E Smidt. Discovering symmetry breaking in physical systems with relaxed group convolution. arXiv preprint arXiv:2310.02299,
-
[30]
Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, and Taco S Cohen. 3D steerable CNNs: Learning rotationally equivariant features in volumetric data. Advances in Neural information processing systems, 31, 2018a. Maurice Weiler, Fred A Hamprecht, and Martin Storath. Learning steerable filters for rotation equivariant CNNs. In Proceedings of the IEE...
-
[31]
3D ShapeNets: A deep representation for volumetric shapes
Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1912–1920,
-
[32]
How Powerful are Graph Neural Networks?
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826,
-
[33]
Learning Representations of Sets through Optimized Permutations
Yan Zhang, Jonathon Hare, and Adam Prügel-Bennett. Learning representations of sets through optimized permutations. arXiv preprint arXiv:1812.03928,
-
[34]
FSPool: Learning set representations with featurewise sort pooling
Yan Zhang, Jonathon Hare, and Adam Prügel-Bennett. FSPool: Learning set representations with featurewise sort pooling. arXiv preprint arXiv:1906.02795, 2019a. Zhen Zhang, Jiajun Bu, Martin Ester, Jianfeng Zhang, Chengwei Yao, Zhi Yu, and Can Wang. Hierarchical graph pooling with structure learning. arXiv preprint arXiv:1911.05954, 2019b. Appendix A Relat...
-
[35]
developed a neural network that learns the canonicalization transformation, which enables plug-and-play equivariance, e.g., orthogonalizing learned features via the Gram-Schmidt process [Trefethen and Bau, 2022]. Their results show that the learned canonicalizers outperform fixed canonicalizers. However, Dym et al
-
[36]
is that when training is initialized, the canonicalizing energy s is random. This leads each datapoint to be randomly transformed, so the task neural network initially has to perform well at all orientations of the data. This can lead the task network to ultimately learn an “average behavior,” not specializing in any special orientation but rather perfor...
-
[37]
iteratively reduces the orientation variance of the training set by iteratively reorienting datapoints that lead to a large loss. We note that these approaches are rather different from our prior maximization method, and they do not try to address the continuity problem in canonicalization. A.1.3 Weighted Canonicalization The energy-based canonicalization...
-
[38]
also discusses continuity preservation, but their approach is different from ours. In the work of Shumaylov et al. [2025], they define the notion of weighted canonicalization, which is a similar concept to the weighted frame introduced by Dym et al. [2024]. Here, to each datapoint there is an assigned probability measure over the orbit of the datapoint. N...
-
[39]
does not involve network retraining, uses the foundation models as is, and performs canonicalization entirely at inference by optimizing over transformations. While it achieves strong empirical performance, its canonicalization mapping is not guaranteed to be continuous, and in fact, continuity is not discussed. Therefore, small input changes may cause...
-
[40]
it avoids computational intractability of full group averaging, especially for large or continuous groups. Recent work [Lin et al., 2024b] proposes minimal frame averaging that attains strong symmetry coverage with small frames. Domain-specific frame averaging methods [Duval et al., 2023b, Atzmon et al., 2022] show that it can be deployed in material mode...
-
[41]
the canonicalization is a function solely of the datapoint, and not the task network. Then, they define a variant of frame averaging, called weighted frame averaging, in which to each datapoint there is an associated probability distribution over the group, and the frame averaging is performed with respect to this measure. This construction yields continu...
-
[42]
conjugate back. Take $f = \mathbf{1}_I$ for a Borel set $I \subset \mathbb{R}$. The indicator function $\mathbf{1}_I(L)$ is an orthogonal projection, since $\mathbf{1}_I(L)^2 = \mathbf{1}_I(L)$ and $\mathbf{1}_I(L)^* = \mathbf{1}_I(L)$. Algorithm 1 (Random maximization). Input: input $g$, backbone network $f$, scalar prior $h(x)$, sampler Sample_U() for $u \sim P$ sampled from a probability measure over $U$, number of random samples $K$, gradient descent step GD...
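The sampling stage of a random-maximization procedure can be sketched as below. The algorithm text above is truncated, so this covers only the part that is visible: draw `K` transformations, score each with the prior, keep the best. The gradient-descent refinement is omitted because its details are not recoverable here. The 2D rotation sampler, the toy backbone, and the prior are assumptions made for illustration.

```python
import numpy as np

def sample_rotation(rng):
    """Assumed sampler: a uniformly random 2D rotation."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def random_maximization(x, f, h, sampler, K, rng):
    """Keep the sampled transformation u that maximizes h(f(u applied to x))."""
    best_u, best_score = None, -np.inf
    for _ in range(K):
        u = sampler(rng)
        score = h(f(x @ u.T))
        if score > best_score:
            best_u, best_score = u, score
    return best_u, best_score

W = np.random.default_rng(1).normal(size=(2, 3))

def backbone(points):
    """Toy stand-in for the backbone network."""
    return np.tanh(points @ W).mean(axis=0)

def prior(output):
    """Toy scalar prior: the largest output coordinate."""
    return float(output.max())

x = np.random.default_rng(2).normal(size=(16, 2))
_, score_small = random_maximization(x, backbone, prior, sample_rotation, 8, np.random.default_rng(0))
_, score_large = random_maximization(x, backbone, prior, sample_rotation, 64, np.random.default_rng(0))
```

With the same seed, the first 8 samples of the larger run coincide with the smaller run, so `score_large` cannot be lower than `score_small`.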
-
[43]
with $f: \mathbb{R} \to \mathbb{R}$, the spectral filter simply reduces to the functional-calculus operator acting on $X$: $f(L)X = \sum_{i=1}^{N} f(\lambda_i) v_i v_i^\top X = V f(\Lambda) V^\top X$. Spectral graph neural networks [Defferrard et al., 2016b, Kipf, 2016, Levie et al., 2018] compose such filters with pointwise nonlinearities, using trainable $g$ at each layer. E Application of Adaptive Canonicalization: Tut...
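The functional-calculus formula for a spectral filter, f(L)X = V f(Λ) Vᵀ X, can be checked numerically. This sketch assumes a symmetric combinatorial Laplacian of a 4-node path graph and two example filter functions; none of these choices come from the paper.

```python
import numpy as np

def spectral_filter(L, X, f):
    """Apply f(L) to X via the eigendecomposition L = V diag(lam) V^T."""
    lam, V = np.linalg.eigh(L)
    return V @ (f(lam)[:, None] * (V.T @ X))

# Path graph on 4 nodes and its combinatorial Laplacian L = D - A.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
X = np.arange(8, dtype=float).reshape(4, 2)

identity_out = spectral_filter(L, X, lambda lam: lam)        # should equal L @ X
heat_out = spectral_filter(L, X, lambda lam: np.exp(-lam))   # heat-kernel smoothing
```

With `f(lam) = lam` the filter reproduces `L @ X`. The heat filter preserves each column sum because the constant vector is an eigenvector with eigenvalue 0.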
-
[44]
prove the following concentration inequality for maxima. Lemma 22(Concentration inequality for volume retaining space [Cordonnier et al., 2024]).Let (X, P) be a probability space with the (r0, κ)-volume retaining property and let g:X 2 →R q be Kg-Lipschitz. For any ρ≥exp(−nκr d 02d), for any random variablesX 1, . . . , Xn i.i.d. ∼P, with probability at l...
-
[45]
E.3 Construction Details for Anisotropic Nonlinear Spectral Filters In spectral methods for graphs, we often use eigenvectors as a core component for graph representation learning. However, these eigenvectors are not uniquely defined. For each eigenvector we can flip its sign, and when an eigenvalue has multiplicity larger than one, any orthogonal basis o...
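The eigenvector ambiguity described above can be seen on a small example. The sketch assumes the Laplacian of a 4-cycle, whose eigenvalue 2 has multiplicity two; the sign pattern, the rotation angle, and the signal `x` are arbitrary choices made here.

```python
import numpy as np

# Cycle graph on 4 nodes: Laplacian eigenvalues are 0, 2, 2, 4.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
lam, V = np.linalg.eigh(L)

# Ambiguity 1: any eigenvector may have its sign flipped.
signs = np.array([1.0, -1.0, 1.0, -1.0])
V_flipped = V * signs

# Ambiguity 2: inside the repeated eigenspace (columns 1 and 2),
# any orthogonal change of basis gives another valid eigenbasis.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
V_rotated = V.copy()
V_rotated[:, 1:3] = V[:, 1:3] @ Q

# Spectral coefficients of a signal depend on which eigenbasis was picked.
x = np.array([1.0, 2.0, 3.0, 4.0])
coeffs = V.T @ x
coeffs_rotated = V_rotated.T @ x
```

All three bases diagonalize `L`, yet `coeffs` and `coeffs_rotated` differ, which is why a method that reads coefficients directly needs some rule for choosing the basis.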
-
[46]
DGCNN [Wang et al., 2019] constructs dynamic k-nearest graphs by computing $G = (V, E)$ where $E = \{(i, j) : j \in \mathrm{kNN}(x_i, k)\}$. Then, the edge convolution is performed by computing the edge features and applying a max pooling: $x_i' = \mathrm{Pool}_{(i,j) \in E}\big(\mathrm{ReLU}(\Psi(x_j - x_i, x_i))\big)$. Applying adaptive canonicalization to the DGCNN architecture, we define a class-specific orientati...
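The edge-convolution formula can be sketched in a few lines. This is a simplified illustration rather than the DGCNN reference code: Ψ is taken to be a single linear map `W`, the pooling is a max over neighbors, and each point is excluded from its own neighbor list, all of which are assumptions made here.

```python
import numpy as np

def knn_indices(x, k):
    """Indices of the k nearest neighbors of each point (self excluded)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def edge_conv(x, k, W):
    """x_i' = max over neighbors j of ReLU(W applied to [x_j - x_i, x_i])."""
    nbrs = knn_indices(x, k)                        # shape (n, k)
    x_i = np.repeat(x[:, None, :], k, axis=1)       # shape (n, k, d)
    x_j = x[nbrs]                                   # shape (n, k, d)
    edge_feat = np.concatenate([x_j - x_i, x_i], axis=-1)  # shape (n, k, 2d)
    return np.maximum(edge_feat @ W, 0.0).max(axis=1)      # shape (n, out)

rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))
W = rng.normal(size=(6, 8))
features = edge_conv(points, k=4, W=W)
```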
-
[47]
All models are implemented in PyTorch and optimized with the Adam optimizer [Kingma and Ba, 2014]. Experiments are conducted on an NVIDIA DGX A100. The output of the GNN is then passed to an MLP, followed by a softmax classifier. F.2 Graph Classification on TUDataset Datasets and Experimental Setup. We consider five graph classification benchmarks from TUD...
-
[48]
Following the random split protocol [Ma et al., 2019, Ying et al., 2018, Zhang et al., 2019b], we partition the dataset into 80% training, 10% validation, and 10% testing. Results are averaged over 10 random splits, with mean accuracy and standard deviation reported. Competing Baselines. We evaluate on medium-scale graph classification benchmarks from TUDa...
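The 80/10/10 random-split protocol with mean and standard deviation over 10 splits can be sketched as follows. The seeding scheme and the use of the population standard deviation are assumptions; the excerpt does not say which variant the authors report.

```python
import random
import statistics

def random_split(n, seed):
    """Shuffle indices 0..n-1 and cut them into 80% / 10% / 10%."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train, n_val = (8 * n) // 10, n // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def summarize(accuracies):
    """Mean and population standard deviation across splits."""
    return statistics.mean(accuracies), statistics.pstdev(accuracies)

# Ten independent splits of a hypothetical 1000-graph dataset.
splits = [random_split(1000, seed) for seed in range(10)]
```

In use, a model would be trained and evaluated once per entry of `splits`, and the ten test accuracies passed to `summarize`.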
-
[49]
The models are implemented in PyTorch and optimized with the Adam optimizer [Kingma and Ba, 2014]. An early stopping strategy is applied, where training halts if the validation loss does not improve for 100 consecutive epochs. The hyperparameters are selected through a grid search, conducted via Optuna [Akiba et al., 2019], with the learning rate and...
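The early-stopping rule described above (halt after 100 consecutive epochs without improvement in validation loss) can be sketched independently of any training framework. The function name and the strict-improvement criterion are assumptions made for illustration.

```python
def early_stopping_epoch(val_losses, patience=100):
    """Return the epoch index at which training would halt, or None."""
    best, since_best = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Strict improvement resets the patience counter.
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return None

# Small patience so the behavior is visible on a short loss curve.
stop_at = early_stopping_epoch([1.0, 0.9, 0.95, 0.96, 0.97], patience=3)
```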
-
[50]
The output representations are then passed into an MLP followed by a softmax layer, and predictions are obtained by optimizing a cross-entropy loss function. Experiments are conducted on an NVIDIA DGX A100. F.3 Molecular Classification on OGB Datasets Datasets and Experimental Setup. We evaluate on larger-scale benchmarks from the Open Graph Benchmark (...
-
[51]
Additionally, the batch size is chosen from $\{32, 64, 128, 256\}$ and the weight decay is chosen from $\{10^{-4}, 10^{-5}, 10^{-6}\}$. All hyperparameters are tuned using Optuna [Akiba et al., 2019]. The experiments are conducted on an NVIDIA A100 GPU. F.4 ModelNet40 Point Cloud Classification Datasets and Experimental Setup. Our evaluation for point cloud classification was...
-
[52]
We see that the node-to-graph construction achieves performance closely aligned with, and in some cases approaching, that of the direct graph-level canonicalization. We attribute the slightly worse performance to the potential pooling loss. Table 6: Graph classification performance on TUDataset using adaptive canonicalization. Comparison between direct gra...
-
[53]
We see that using the dyadic partitions performs better than using the uniform partition. tion provided by dyadic bands, which could more effectively isolate band-wise unitary actions that commute with the chosen GSO. We also note that spectral band design can be realized in more flexible and expressive ways, for example, through attention as in SpecForme...
-
[54]
Overall, we observe that our method is reasonably robust. For grid size and sinusoidal period, performance remains stable across the tested ranges. For the noise level, small to moderate noise leads to similar performance, with a degradation only when the noise becomes large enough that it effectively corrupts the underlying structure of the data. For the...
-
[55]
We see that truncation-based prior maximization improves classification performance over the standard vanilla baseline. This implies that our method enables the model to adaptively select a canonical truncation that enhances downstream performance. In addition, we observe that the selected canonical crops tend to tightly focus on the main object while dis...