Learning Color Equivariant Representations
Pith reviewed 2026-05-24 00:12 UTC · model grok-4.3
The pith
A lifting layer lets group convolutional networks achieve color equivariance by transforming the input image directly.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Group convolutional neural networks achieve equivariance to color variation through a lifting layer that transforms the input image directly rather than the convolutional filters. This construction extends hue equivariance to saturation and luminance shifts, eliminates invalid RGB values, and produces networks with strong generalization to out-of-distribution perceptual changes plus improved sample efficiency over conventional architectures.
What carries the argument
The lifting layer, which transforms the input image directly to support color group convolutions while preserving valid RGB values.
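A minimal sketch of what lifting by transforming the input could look like, restricted to a discrete hue group of N shifts. It is not the authors' implementation: the function names `shift_hue` and `lift_hue`, the use of HSV via Python's standard `colorsys` module, and the nested-list image format are all assumptions made for illustration.

```python
import colorsys

def shift_hue(image, delta):
    """Shift the hue of every pixel by `delta` (a fraction of a full turn)."""
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            # Hue lives on a circle, so the shift wraps around modulo 1.
            new_row.append(colorsys.hsv_to_rgb((h + delta) % 1.0, s, v))
        out.append(new_row)
    return out

def lift_hue(image, n_shifts):
    """Hypothetical lifting: stack n_shifts hue-shifted copies of the input."""
    return [shift_hue(image, k / n_shifts) for k in range(n_shifts)]

def max_abs_diff(a, b):
    """Largest per-channel absolute difference between two images."""
    return max(abs(x - y)
               for ra, rb in zip(a, b)
               for pa, pb in zip(ra, rb)
               for x, y in zip(pa, pb))

# A 2x2 RGB image with channel values in [0, 1] (made-up data).
image = [[(0.9, 0.2, 0.1), (0.1, 0.6, 0.3)],
         [(0.5, 0.5, 0.5), (0.2, 0.3, 0.8)]]
N = 6
lifted = lift_hue(image, N)
# Hue-shifting the input by one group step should cyclically permute the stack.
shifted_then_lifted = lift_hue(shift_hue(image, 1 / N), N)
```

Because every copy is produced by a round trip through HSV, all channel values stay inside [0, 1], and shifting the input by one step moves each slice of the stack by one index. A learned group convolution would then operate on this stack; that part is omitted here.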
If this is right
- The networks generalize strongly to out-of-distribution perceptual variations in hue, saturation, and luminance.
- Sample efficiency improves relative to conventional convolutional architectures.
- Performance exceeds competitive baselines on both synthetic and real-world image datasets.
Where Pith is reading between the lines
- The same lifting approach could be tested on other continuous perceptual attributes such as contrast or white balance.
- Color-equivariant layers might combine with geometric equivariance layers for joint robustness to lighting and viewpoint changes.
- The method could be evaluated on video sequences to check whether temporal consistency of color transformations holds.
Load-bearing premise
That applying the lifting layer to the input image produces a representation that remains valid and equivariant under color group actions without introducing new artifacts.
What would settle it
Run the same set of color-transformed test images through the network with and without the lifting layer; if the measured equivariance error does not fall by at least three orders of magnitude with the lifting layer, the central claim is false.
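One way such a measurement could be set up is sketched below, using the definition given in the simulated rebuttal further down: the mean L2 distance between f(g·x) and g·f(x), normalized by the input norm. The toy setting (1D signals, cyclic shifts standing in for a color group action, and the two example maps) is an assumption for illustration and does not reproduce the paper's networks or data.

```python
import math

def roll(x, k):
    """Cyclic shift of a list by k positions (stand-in for a color group action)."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

def l2(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in x))

def equivariance_error(f, act, inputs, group_elements):
    """Mean of ||f(g.x) - g.f(x)||_2 / ||x||_2 over inputs x and elements g."""
    total, count = 0.0, 0
    for x in inputs:
        fx = f(x)
        for g in group_elements:
            lhs = f(act(x, g))   # transform first, then apply the map
            rhs = act(fx, g)     # apply the map first, then transform
            total += l2([a - b for a, b in zip(lhs, rhs)]) / l2(x)
            count += 1
    return total / count

def circular_conv(x, kernel=(0.5, 0.3, 0.2)):
    """Circular convolution: equivariant to cyclic shifts by construction."""
    n = len(x)
    return [sum(w * x[(i - j) % n] for j, w in enumerate(kernel)) for i in range(n)]

def position_weighted(x):
    """Position-dependent scaling: deliberately not shift-equivariant."""
    return [(i + 1) * v for i, v in enumerate(x)]

inputs = [[1.0, 2.0, 0.0, -1.0, 0.5, 3.0], [0.2, 0.0, 0.7, 0.1, 0.9, 0.4]]
shifts = range(6)
err_equivariant = equivariance_error(circular_conv, roll, inputs, shifts)
err_broken = equivariance_error(position_weighted, roll, inputs, shifts)
```

In this toy case the equivariant map yields an error of zero and the position-weighted map does not. Comparing two real networks would follow the same pattern, with `f` replaced by the network and `act` by the color transformation on inputs and on feature maps.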
Original abstract
In this paper, we introduce group convolutional neural networks (GCNNs) equivariant to color variation. GCNNs have been designed for a variety of geometric transformations from 2D and 3D rotation groups, to semi-groups such as scale. Despite the improved interpretability, accuracy and generalizability of these architectures, GCNNs have seen limited application in the context of perceptual quantities. Notably, the recent CEConv network uses a GCNN to achieve equivariance to hue transformations by convolving input images with a hue rotated RGB filter. However, this approach leads to invalid RGB values which break equivariance and degrade performance. We resolve these issues with a lifting layer that transforms the input image directly, thereby circumventing the issue of invalid RGB values and improving equivariance error by over three orders of magnitude. Moreover, we extend the notion of color equivariance to include equivariance to saturation and luminance shift. Our hue-, saturation-, luminance- and color-equivariant networks achieve strong generalization to out-of-distribution perceptual variations and improved sample efficiency over conventional architectures. We demonstrate the utility of our approach on synthetic and real world datasets where we consistently outperform competitive baselines.
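The invalid-value problem mentioned in the abstract can be illustrated with a small sketch. It assumes a hue shift is modeled as a 3D rotation of the RGB vector about the gray axis (1, 1, 1); the abstract only says that hue-rotated RGB filters lead to invalid values, so this specific construction and the function name `rotate_about_gray_axis` are assumptions.

```python
import math

def rotate_about_gray_axis(rgb, degrees):
    """Rotate an RGB vector about the (1, 1, 1) axis using Rodrigues' formula."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    r, g, b = rgb
    # Component of the color along the gray axis (same value for each channel).
    mean = (r + g + b) / 3.0
    # Cross product of the unit axis (1, 1, 1) / sqrt(3) with the color vector.
    k = 1.0 / math.sqrt(3.0)
    cross = (k * (b - g), k * (r - b), k * (g - r))
    return tuple(c * v + (1.0 - c) * mean + s * x for v, x in zip(rgb, cross))

red = (1.0, 0.0, 0.0)
rotated_60 = rotate_about_gray_axis(red, 60)    # leaves the RGB cube
rotated_120 = rotate_about_gray_axis(red, 120)  # lands on pure green
```

Under this model, rotating pure red by 60 degrees gives roughly (0.667, 0.667, -0.333), with a negative blue channel, while a 120 degree rotation maps it to green and stays valid. Only rotations that permute the channels keep the whole unit cube inside itself, which is consistent with the abstract's point that finer hue rotations in RGB produce out-of-range values.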
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces group convolutional neural networks (GCNNs) that are equivariant to color variations, specifically extending to hue, saturation, and luminance shifts. It proposes a lifting layer that transforms the input image directly (rather than convolving with rotated filters as in CEConv) to avoid invalid RGB values. The authors claim this yields an equivariance error improvement of over three orders of magnitude, leading to stronger out-of-distribution generalization and better sample efficiency than conventional architectures, with demonstrations on synthetic and real-world datasets.
Significance. If the equivariance error reduction and fair baseline comparisons hold under the stated conditions, the work would provide a concrete architectural improvement for perceptual equivariance in GCNNs, potentially aiding generalization in color-sensitive vision tasks. The explicit handling of saturation and luminance beyond hue is a useful extension.
major comments (2)
- [Method (lifting layer description) and Experiments (equivariance error table/figure)] The central claim of the lifting layer improving equivariance error by over three orders of magnitude (abstract and method) is load-bearing for the generalization and sample-efficiency results, yet the manuscript provides no explicit definition of the equivariance error metric (e.g., L2 norm over which group elements, normalization, test distribution), no equation for the lifting operator, and no confirmation that CEConv baselines were reimplemented with identical hyperparameters or color-space handling.
- [Preliminaries / Color Equivariance section] § on color group definitions: the extension to saturation and luminance equivariance is presented as a direct generalization, but the paper does not specify whether the combined color group is a direct product or semidirect product and how the lifting layer composes the individual transformations without introducing additional parameters or breaking the claimed parameter-free property.
minor comments (2)
- [Experiments] Figure captions for the synthetic dataset experiments should explicitly state the number of training samples used for the sample-efficiency curves and whether error bars represent standard deviation over multiple seeds.
- [Preliminaries] Notation for the hue/saturation/luminance transformations is introduced without a compact group-theoretic summary (e.g., explicit generators or parametrization), which would aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address the major comments below and will make revisions to improve the clarity of the manuscript.
- Referee: [Method (lifting layer description) and Experiments (equivariance error table/figure)] The central claim of the lifting layer improving equivariance error by over three orders of magnitude (abstract and method) is load-bearing for the generalization and sample-efficiency results, yet the manuscript provides no explicit definition of the equivariance error metric (e.g., L2 norm over which group elements, normalization, test distribution), no equation for the lifting operator, and no confirmation that CEConv baselines were reimplemented with identical hyperparameters or color-space handling.
Authors: We agree that these details should be made explicit to support the central claims. The equivariance error is measured as the average L2 distance between f(g·x) and g·f(x) over sampled group elements g in the color group, normalized by the input magnitude, and evaluated on a held-out test distribution. We will add this definition in mathematical form, together with an explicit equation for the lifting operator L(x), which transforms the input x directly in color space. For the baselines, CEConv was reimplemented using the same hyperparameters and RGB color-space handling as in the original work; we will add a statement confirming these reimplementation details to the experiments section.
revision: yes
- Referee: [Preliminaries / Color Equivariance section] § on color group definitions: the extension to saturation and luminance equivariance is presented as a direct generalization, but the paper does not specify whether the combined color group is a direct product or semidirect product and how the lifting layer composes the individual transformations without introducing additional parameters or breaking the claimed parameter-free property.
Authors: The color group is the direct product of the hue, saturation, and luminance groups since these transformations act independently on the color channels and commute. The lifting layer composes them by applying the transformations sequentially to the input image in HSL color space before lifting to the group, which remains parameter-free as no learned weights are involved in the lifting. We will add this specification to the preliminaries section to clarify the group structure and composition.
revision: yes
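The direct-product claim in the second response can be sketched as follows: if hue, saturation, and luminance shifts each act on a separate HSL coordinate, the order of application does not matter. The function names, the use of `colorsys` (which orders the coordinates as H, L, S), and the clipping of saturation and luminance to [0, 1] are assumptions of this sketch; clipping in particular means the shifts are not invertible at the boundary, which the rebuttal does not address.

```python
import colorsys

def clip01(v):
    """Clamp a value to the interval [0, 1]."""
    return min(1.0, max(0.0, v))

def shift_hue(hls, dh):
    h, l, s = hls
    return ((h + dh) % 1.0, l, s)      # hue wraps around the circle

def shift_saturation(hls, ds):
    h, l, s = hls
    return (h, l, clip01(s + ds))      # translation, clipped (assumption)

def shift_luminance(hls, dl):
    h, l, s = hls
    return (h, clip01(l + dl), s)      # translation, clipped (assumption)

def apply_color_element(rgb, dh, ds, dl):
    """Apply a (hue, saturation, luminance) triple to one RGB pixel via HSL."""
    hls = colorsys.rgb_to_hls(*rgb)
    hls = shift_luminance(shift_saturation(shift_hue(hls, dh), ds), dl)
    return colorsys.hls_to_rgb(*hls)

pixel = (0.8, 0.3, 0.2)
hls0 = colorsys.rgb_to_hls(*pixel)
# The same three shifts applied in two different orders.
order_a = shift_luminance(shift_saturation(shift_hue(hls0, 0.25), -0.2), 0.1)
order_b = shift_hue(shift_luminance(shift_saturation(hls0, -0.2), 0.1), 0.25)
result = apply_color_element(pixel, 0.25, -0.2, 0.1)
```

Both orders give the same HSL triple because each shift touches only its own coordinate, and the result converts back to an RGB value inside [0, 1]. No learned weights appear anywhere in the composition.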
Circularity Check
No circularity; architectural claim rests on new component and experiments
The paper introduces a lifting layer as a novel fix for invalid RGB values in prior hue-equivariant GCNNs (CEConv) and reports empirical gains in equivariance error plus OOD generalization. No equations, derivations, or self-citations appear in the provided text that would reduce any claimed prediction or result to its own inputs by construction. The central claims are presented as consequences of the architectural design choice and benchmark results rather than tautological redefinitions or fitted quantities renamed as predictions. This is a standard non-circular case of an empirical architecture paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Group convolutional neural networks can be extended to perceptual color transformations while maintaining equivariance
invented entities (1)
- lifting layer for direct image transformation (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel. Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "lifting layer that transforms the input image directly, thereby circumventing the issue of invalid RGB values and improving equivariance error by over three orders of magnitude"
- IndisputableMonolith/Foundation/DimensionForcing.lean, alexander_duality_circle_linking. Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We identify variations in saturation and luminance with the 1D translation group... hue group H_N ≅ C_N"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.