Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness

Christina Heinze-Deml; Fanny Yang; Zuowen Wang

Invariance-inducing regularization on worst-case spatial transformations boosts both accuracy and robustness with no trade-off.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.


T0 review · grok-4.3

2026-05-25 15:41 UTC pith:BFCF6WEY

Load-bearing objection: The paper shows that adding invariance regularization on worst-case spatial transformations cuts CIFAR10 error by 20% (relative) while also lifting SVHN clean accuracy, with a no-trade-off proof only in the infinite-data limit.

arXiv 1906.11235 v1 · pith:BFCF6WEY · submitted 2019-06-26 · cs.LG cs.CV stat.ML


Fanny Yang, Zuowen Wang, Christina Heinze-Deml

classification cs.LG cs.CV stat.ML

keywords invariance regularization, spatial robustness, adversarial training, transformation groups, CIFAR10, SVHN, equivariance

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that adding regularization to enforce invariance under worst-case transformations from a spatial group improves predictive accuracy on both clean images and adversarially transformed ones. On CIFAR10 this yields a 20 percent relative error reduction when layered on top of standard or adversarial training, without raising computational cost, and it surpasses networks built explicitly for spatial equivariance. On SVHN, a dataset with inherent variance in orientation, robust training also improves standard accuracy. The authors prove that the accuracy-robustness trade-off disappears for transformation-group adversaries in the infinite-data limit.

Core claim

Invariance-inducing regularization using worst-case transformations from spatial transformation groups increases both standard accuracy and robustness to adversarial spatial transformations, with the no-trade-off phenomenon holding in the infinite data limit and delivering a 20 percent relative error reduction on CIFAR10 when added to standard or adversarial training.

What carries the argument

Invariance-inducing regularization applied to worst-case elements of a transformation group, which simultaneously enforces invariance and improves generalization.
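
The mechanism can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a worst-of-k search over a finite list of transformations, a squared L2 distance between clean and transformed logits as the invariance penalty, and a penalty weight `lam`. The function names, the penalty choice, and the sampling scheme are illustrative assumptions that this page does not specify.

```python
import numpy as np


def softmax_cross_entropy(logits, label):
    # Numerically stable cross-entropy for a single example.
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]


def worst_of_k(model, x, label, transforms, k, rng):
    # Sample k candidate transformations (assumed search strategy) and
    # keep the transformed input with the highest classification loss.
    k = min(k, len(transforms))
    idx = rng.choice(len(transforms), size=k, replace=False)
    candidates = [transforms[i](x) for i in idx]
    losses = [softmax_cross_entropy(model(c), label) for c in candidates]
    return candidates[int(np.argmax(losses))]


def regularized_loss(model, x, label, transforms, k=10, lam=1.0, rng=None):
    # Clean loss plus an invariance penalty on the worst-case transform.
    # The squared L2 distance between logits is an assumed penalty.
    rng = np.random.default_rng(0) if rng is None else rng
    x_adv = worst_of_k(model, x, label, transforms, k, rng)
    clean_logits = model(x)
    adv_logits = model(x_adv)
    penalty = np.sum((clean_logits - adv_logits) ** 2)
    return softmax_cross_entropy(clean_logits, label) + lam * penalty
```

For a model whose logits are already identical across the transformation set, the penalty term is zero and the loss reduces to the clean cross-entropy. For any other model the penalty is non-negative, so the regularized loss never falls below the clean loss.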

Load-bearing premise

The no-trade-off result and accuracy gains require the infinite data limit for adversarial examples drawn from transformation groups.

What would settle it

A controlled experiment on a finite dataset that still shows a clear accuracy-robustness trade-off after adding the regularization would falsify the claim that the no-trade-off phenomenon transfers to practical regimes.


Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

The paper shows adding invariance regularization on worst-case spatial transforms cuts CIFAR10 error 20% relative while also lifting SVHN clean accuracy, with a no-trade-off proof only in the infinite-data limit.


The main thing to know is that the authors report a 20% relative error drop on CIFAR10 under spatial adversarial examples when they add their invariance-inducing regularizer on top of either standard or adversarial training, and the same trick improves clean accuracy on SVHN. They also claim this beats some hand-designed equivariant networks without extra compute or custom layers. That combination of numbers is the concrete contribution worth noting first.

The theory part is a clean statement that no accuracy-robustness trade-off occurs for transformation-group adversaries when data is infinite. The empirical side is presented separately on finite datasets, so the two pieces sit side by side rather than one deriving the other. The work is straightforward to follow from the abstract and does not appear to rest on circular fitting or invented quantities.

The soft spot is exactly the one the stress-test flags: the proof is limited to the infinite-data regime, yet the headline empirical claims are finite-data results on CIFAR10 and SVHN. Nothing in the provided abstract bridges how the limit result explains or guarantees the observed finite behavior, which leaves the unified no-trade-off story resting on an unclosed gap.

This paper is aimed at people who work on spatial robustness in image classifiers and want a regularization route instead of architectural changes. A reader who cares about regularization for invariance will find the reported gains and the infinite-limit proof useful to examine. It has enough specific empirical and theoretical content to deserve a serious referee, even though the theory-experiment connection will need scrutiny in review.

Referee Report

1 major / 0 minor

Summary. The paper claims that invariance-inducing regularization based on worst-case transformations from transformation groups improves both accuracy and spatial robustness. It reports a 20% relative error reduction on CIFAR10 when added to standard or adversarial training (outperforming handcrafted equivariant networks) without extra compute cost, notes accuracy gains on SVHN, and proves that the no-trade-off phenomenon holds for such adversaries in the infinite-data limit.

Significance. If the central claims hold, the work offers a simple, low-cost regularization approach that simultaneously boosts standard accuracy and robustness to spatial transformations, with a theoretical guarantee in the infinite-data regime. The empirical outperformance of specialized equivariant architectures on CIFAR10 would be a notable practical result.

major comments (1)

[Abstract] Abstract: the no-trade-off result is proven only under the infinite-data limit for transformation-group adversaries, while the headline empirical claim (20% relative error reduction on CIFAR10) and the outperformance of equivariant nets are finite-data observations. The manuscript must explicitly address whether and how the infinite-limit analysis explains or approximates the finite-data behavior; without this bridge the unified claim that the regularization produces no accuracy-robustness trade-off rests on an unbridged gap between theory and experiment.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful review and the constructive suggestion regarding the connection between theory and experiments. We address the point below and will revise the manuscript to make the relationship more explicit.


Referee: [Abstract] Abstract: the no-trade-off result is proven only under the infinite-data limit for transformation-group adversaries, while the headline empirical claim (20% relative error reduction on CIFAR10) and the outperformance of equivariant nets are finite-data observations. The manuscript must explicitly address whether and how the infinite-limit analysis explains or approximates the finite-data behavior; without this bridge the unified claim that the regularization produces no accuracy-robustness trade-off rests on an unbridged gap between theory and experiment.

Authors: We agree that the manuscript would benefit from an explicit discussion of how the infinite-data result relates to the finite-data observations. In the revised version we will add a paragraph in the discussion (and a clarifying sentence in the abstract) noting that the infinite-data analysis establishes the absence of a fundamental accuracy-robustness trade-off for transformation-group adversaries, thereby providing theoretical motivation for the proposed regularization; the finite-data experiments then demonstrate that the same regularization yields measurable gains in practice on standard benchmarks. We will also state that the theory suggests the observed benefits are expected to persist or strengthen with increasing data volume, while acknowledging that a quantitative finite-sample approximation remains an open direction.

Revision: yes

Circularity Check

0 steps flagged

No circularity: theory conditioned on external infinite-data limit; empirical results independent


The paper states a proof of the no-trade-off phenomenon explicitly conditioned on the infinite data limit for transformation-group adversaries, with finite-data CIFAR10/SVHN results presented as separate empirical outcomes. No quoted equations or self-citations reduce any central claim to a fitted input, self-definition, or author-prior ansatz by construction. The derivation chain is self-contained against external benchmarks (infinite-limit math and held-out test sets).

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on empirical results from CIFAR10 and SVHN plus a theoretical proof that holds only in the infinite data limit; no free parameters, invented entities, or additional axioms are mentioned in the abstract.

axioms (1)

domain assumption The no-trade-off phenomenon holds for adversarial examples from transformation groups in the infinite data limit
Explicitly stated in the abstract as the condition under which the proof applies.

pith-pipeline@v0.9.0 · 5642 in / 1478 out tokens · 33172 ms · 2026-05-25T15:41:16.184828+00:00 · methodology


Original abstract

This work provides theoretical and empirical evidence that invariance-inducing regularizers can increase predictive accuracy for worst-case spatial transformations (spatial robustness). Evaluated on these adversarially transformed examples, we demonstrate that adding regularization on top of standard or adversarial training reduces the relative error by 20% for CIFAR10 without increasing the computational cost. This outperforms handcrafted networks that were explicitly designed to be spatial-equivariant. Furthermore, we observe for SVHN, known to have inherent variance in orientation, that robust training also improves standard accuracy on the test set. We prove that this no-trade-off phenomenon holds for adversarial examples from transformation groups in the infinite data limit.
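
The "20% relative error" figure is a relative, not absolute, reduction. The short sketch below makes the arithmetic explicit; the baseline error of 0.30 is a hypothetical number chosen for illustration and is not taken from the paper.

```python
def relative_error_reduction(baseline_error, new_error):
    # Fraction of the baseline error that was removed.
    return (baseline_error - new_error) / baseline_error


def error_after_relative_reduction(baseline_error, relative_reduction):
    # Error rate that remains after removing the given fraction.
    return baseline_error * (1.0 - relative_reduction)


# Hypothetical baseline: 30% error on adversarially transformed examples.
hypothetical_baseline = 0.30
remaining = error_after_relative_reduction(hypothetical_baseline, 0.20)
```

Under that hypothetical baseline, a 20% relative reduction corresponds to an absolute drop of 6 percentage points, whereas a 20-point absolute drop would be a much larger claim.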

Figures

Figures reproduced from arXiv: 1906.11235 by Christina Heinze-Deml, Fanny Yang, Zuowen Wang.

**Figure 1.** Example images and classifications by the Standard model. (a) An image that is correctly classified for most of the rotations in the considered grid. (b) One rotation for which the image shown in (a) is misclassified as “airplane”. On top of interpolation, rotation also creates edge artifacts at the boundaries, as the image is only sampled in a bounded set. The empty space that results from translating and…

**Figure 2.** Mean runtime for different methods on CIFAR-10. The connected points correspond to Wo-k defenses with k ∈ {1, 10, 20}.

**Figure 3.** Illustration of an example where one group orbit …

**Figure 4.** Test grid accuracy (first row) and test natural accuracy (second row) as a function of the …

**Figure 5.** Test grid accuracy (first row) and test natural accuracy (second row) as a function of the …

**Figure 6.** For 100 randomly chosen examples from the CIFAR-10 dataset, we show which rotations lead to a …
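
The captions above distinguish natural accuracy from grid accuracy. The sketch below shows how the two metrics could be computed side by side. It assumes that grid accuracy counts an example as correct only if every transformation on the evaluation grid is classified correctly; this page does not reproduce the paper's definition, so treat that rule, the function name, and the `transforms` argument as assumptions.

```python
import numpy as np


def natural_and_grid_accuracy(predict, images, labels, transforms):
    # predict: maps one image to a class index.
    # transforms: finite list of callables forming the evaluation grid.
    nat_hits, grid_hits = [], []
    for x, y in zip(images, labels):
        # Natural accuracy: prediction on the untransformed image.
        nat_hits.append(predict(x) == y)
        # Grid accuracy (assumed definition): correct on every grid point.
        grid_hits.append(all(predict(t(x)) == y for t in transforms))
    return float(np.mean(nat_hits)), float(np.mean(grid_hits))
```

If the grid contains the identity transformation, grid accuracy under this definition can never exceed natural accuracy, which is why the gap between the two is a direct measure of spatial brittleness.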



[1] [1]

Abadi, A

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorﬂow.org

work page 2015

[2] [2]

Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects

Michael A Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, and Anh Nguyen. Strike (with) a pose: Neural networks are easily fooled by strange poses of familiar objects.arXiv preprint arXiv:1811.11553, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[3] [3]

Document image defect models

Henry S Baird. Document image defect models. InStructured Document Image Analysis , pages 546–556. Springer, 1992

work page 1992

[4] [4]

Universal approximation bounds for superpositions of a sigmoidal function.IEEE Trans

Andrew R Barron. Universal approximation bounds for superpositions of a sigmoidal function.IEEE Trans. Info. Theory, 39(3):930–945, 1993

work page 1993

[5] [5]

Princeton University Press, 2009

Aharon Ben-Tal, Laurent El Ghaoui, and Arkadi Nemirovski.Robust optimization, volume 28. Princeton University Press, 2009

work page 2009

[6] [6]

Towards evaluating the robustness of neural networks

Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. InProceedings of the IEEE Symposium on Security and Privacy (SP) , pages 39–57. IEEE, 2017

work page 2017

[7] [7]

Learning rotation-invariant and Fisher discriminativeconvolutionalneural networks forobject detection

Gong Cheng, Junwei Han, Peicheng Zhou, and Dong Xu. Learning rotation-invariant and Fisher discriminativeconvolutionalneural networks forobject detection. IEEE Transactions on Image Processing, 28(1):265–278, 2019

work page 2019

[8] [8]

Group equivariant convolutional networks

Taco Cohen and Max Welling. Group equivariant convolutional networks. In Proceedings of the International Conference on Machine Learning , pages 2990–2999, 2016

work page 2016

[9] [9]

Spherical CNNs

Taco S Cohen, Mario Geiger, Jonas Köhler, and Max Welling. Spherical CNNs. InProceedings of the International Conference on Learning Representations , 2018

work page 2018

[10] [10]

Robustness of Rotation-Equivariant Networks to Adversarial Perturbations

Beranger Dumont, Simona Maggio, and Pablo Montalvo. Robustness of rotation-equivariant networks to adversarial perturbations. arXiv preprint arXiv:1802.06627 , 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[11] [11]

Exploring the landscape of spatial robustness

Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. Exploring the landscape of spatial robustness. InProceedings of the International Conference on Machine Learning , 2019

work page 2019

[12] [12]

Polar transformer networks

Carlos Esteves, Christine Allen-Blanchette, Xiaowei Zhou, and Kostas Daniilidis. Polar transformer networks. In Proceedings of the International Conference on Learning Representations , 2018

work page 2018

[13] [13]

Fawzi and P

A. Fawzi and P. Frossard. Manitest: Are classiﬁers really invariant? In British Machine Vision Conference (BMVC), 2015

work page 2015

[14] [14]

Generalisation in humans and deep neural networks

Robert Geirhos, Carlos RM Temme, Jonas Rauber, Heiko H Schütt, Matthias Bethge, and Felix A Wichmann. Generalisation in humans and deep neural networks. InAdvances in Neural Information Processing Systems, pages 7549–7561, 2018

work page 2018

[15] [15]

arXiv preprint arXiv:1708.02691 , year=

Boris Hanin. Universal function approximation by deep neural nets with bounded width and relu activations. arXiv preprint arXiv:1708.02691 , 2017

work page arXiv 2017

[16] [16]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Patern Recognition , pages 770–778, 2016

work page 2016

[17] [17]

Conditional Variance Penalties and Domain Shift Robustness

Christina Heinze-Deml and Nicolai Meinshausen. Conditional variance penalties and domain shift robustness. arXiv preprint arXiv:1710.11469 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[18] [18]

Multilayer feedforward networks are universal approximators

Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural networks, 2(5):359–366, 1989

work page 1989

[19] [19]

Spatial Transformer Networks

Max Jaderberg, Karen Simonyan, Andrew Zisserman, et al. Spatial Transformer Networks. InAdvances in Neural Information Processing Systems , pages 2017–2025, 2015

work page 2017

[20] [20]

Geometric robustness of deep networks: analysis and improvement

Can Kanbak, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. Geometric robustness of deep networks: analysis and improvement. InProceedings of the IEEE Conference on Computer Vision and Patern Recognition, pages 4441–4449, 2018

work page 2018

[21] [21]

Adversarial Logit Pairing

Harini Kannan, Alexey Kurakin, and Ian Goodfellow. Adversarial Logit Pairing. arXiv preprint arXiv:1803.06373, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[22] [22]

Learning multiple layers of features from tiny images

Alex Krizhevsky and Geoﬀrey Hinton. Learning multiple layers of features from tiny images. Technical Report 4, University of Toronto, 2009

work page 2009

[23]

Adversarial examples in the physical world

Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[24]

TI-POOLING: Transformation-invariant pooling for feature learning in convolutional neural networks

Dmitry Laptev, Nikolay Savinov, Joachim M Buhmann, and Marc Pollefeys. TI-POOLING: Transformation-invariant pooling for feature learning in convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 289–297, 2016

work page 2016

[25]

An empirical evaluation of deep architectures on problems with many factors of variation

Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning, pages 473–480, 2007

work page 2007

[26]

Towards deep learning models resistant to adversarial attacks

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In Proceedings of the International Conference on Learning Representations, 2018

work page 2018

[27]

Rotation equivariant vector field networks

Diego Marcos, Michele Volpi, Nikos Komodakis, and Devis Tuia. Rotation equivariant vector field networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 5058–5067, 2017

work page 2017

[28]

Differentiable abstract interpretation for provably robust neural networks

Matthew Mirman, Timon Gehr, and Martin Vechev. Differentiable abstract interpretation for provably robust neural networks. In Proceedings of the International Conference on Machine Learning, pages 3575–3583, 2018

work page 2018

[29]

Deepfool: A simple and accurate method to fool deep neural networks

Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. Deepfool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582, 2016

work page 2016

[30]

Unified deep supervised domain adaptation and generalization

Saeid Motiian, Marco Piccirilli, Donald A Adjeroh, and Gianfranco Doretto. Unified deep supervised domain adaptation and generalization. In Proceedings of the IEEE International Conference on Computer Vision, volume 2, page 3, 2017

work page 2017

[31]

Cascade adversarial machine learning regularized with a unified embedding

Taesik Na, Jong Hwan Ko, and Saibal Mukhopadhyay. Cascade adversarial machine learning regularized with a unified embedding. In Proceedings of the International Conference on Learning Representations, 2018

work page 2018

[32]

Reading digits in natural images with unsupervised feature learning

Y. Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, page 5, 2011

work page 2011

[33]

Practical black-box attacks against machine learning

Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Proceedings of the ACM Asia Conference on Computer and Communications Security, pages 506–519. ACM, 2017

work page 2017

[34]

Towards practical verification of machine learning: The case of computer vision systems

Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. Towards practical verification of machine learning: The case of computer vision systems. arXiv preprint arXiv:1712.01785, 2017

work page arXiv 2017

[35]

Certified defenses against adversarial examples

Aditi Raghunathan, Jacob Steinhardt, and Percy Liang. Certified defenses against adversarial examples. In Proceedings of the International Conference on Learning Representations, 2018

work page 2018

[36]

Adversarial training can hurt generalization

Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, and Percy Liang. Adversarial training can hurt generalization. arXiv preprint arXiv:1906.06032, 2019

work page arXiv 1906

[37]

Defense-GAN: Protecting classifiers against adversarial attacks using generative models

Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. In Proceedings of the International Conference on Learning Representations, 2018

work page 2018

[38]

Very deep convolutional networks for large-scale image recognition

K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations, 2015

work page 2015

[39]

Certifiable distributional robustness with principled adversarial training

Aman Sinha, Hongseok Namkoong, and John Duchi. Certifiable distributional robustness with principled adversarial training. In Proceedings of the International Conference on Learning Representations, 2018

work page 2018

[40]

Intriguing properties of neural networks

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In Proceedings of the International Conference on Learning Representations, 2014

work page 2014

[41]

Equivariant Transformer Networks

Kai Sheng Tai, Peter Bailis, and Gregory Valiant. Equivariant Transformer Networks. In Proceedings of the International Conference on Machine Learning, 2019

work page 2019

[42]

Robustness may be at odds with accuracy

Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. In Proceedings of the International Conference on Learning Representations, 2019

work page 2019

[43]

Learning steerable filters for rotation equivariant CNNs

Maurice Weiler, Fred A Hamprecht, and Martin Storath. Learning steerable filters for rotation equivariant CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018

work page 2018

[44]

Provable defenses against adversarial examples via the convex outer adversarial polytope

Eric Wong and Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning, pages 5283–5292, 2018

work page 2018

[45]

Harmonic networks: Deep translation and rotation equivariance

Daniel E Worrall, Stephan J Garbin, Daniyar Turmukhambetov, and Gabriel J Brostow. Harmonic networks: Deep translation and rotation equivariance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5028–5037, 2017

work page 2017

[46]

Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, and Quoc V. Le. Unsupervised data augmentation. arXiv preprint arXiv:1904.12848, 2019

work page arXiv 1904

[47]

Effective training of a neural network character classifier for word recognition

Larry S. Yaeger, Richard F. Lyon, and Brandyn J. Webb. Effective training of a neural network character classifier for word recognition. In Advances in Neural Information Processing Systems, pages 807–816, 1997

work page 1997

[48]

Understanding deep learning requires rethinking generalization

Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. In Proceedings of the International Conference on Learning Representations, 2015

work page 2015

[49]

Theoretically principled trade-off between robustness and accuracy

Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, and Michael I. Jordan. Theoretically principled trade-off between robustness and accuracy. In Proceedings of the International Conference on Machine Learning, 2019

work page 2019

[50]

Improving the robustness of deep neural networks via stability training

Stephan Zheng, Yang Song, Thomas Leung, and Ian Goodfellow. Improving the robustness of deep neural networks via stability training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4480–4488, 2016

work page 2016

[51]

Oriented response networks

Yanzhao Zhou, Qixiang Ye, Qiang Qiu, and Jianbin Jiao. Oriented response networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 519–528, 2017

work page 2017

A Appendix

A.1 Rigorous definition of transformation sets and choice of S

In the following we introduce the concepts that are needed to rigorously define transformation sets t...