Parseval Networks: Improving Robustness to Adversarial Examples

arxiv: 1704.08847 · v2 · pith:PX56HRN7 · submitted 2017-04-28 · stat.ML · cs.AI · cs.CR · cs.LG

Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, Nicolas Usunier

classification stat.ML cs.AI cs.CR cs.LG

keywords networks parseval adversarial matrices convolutional deep examples layers

Abstract

We introduce Parseval networks, a form of deep neural networks in which the Lipschitz constant of linear, convolutional and aggregation layers is constrained to be smaller than 1. Parseval networks are empirically and theoretically motivated by an analysis of the robustness of the predictions made by deep neural networks when their input is subject to an adversarial perturbation. The most important feature of Parseval networks is to maintain weight matrices of linear and convolutional layers to be (approximately) Parseval tight frames, which are extensions of orthogonal matrices to non-square matrices. We describe how these constraints can be maintained efficiently during SGD. We show that Parseval networks match the state-of-the-art in terms of accuracy on CIFAR-10/100 and Street View House Numbers (SVHN) while being more robust than their vanilla counterpart against adversarial examples. Incidentally, Parseval networks also tend to train faster and make a better usage of the full capacity of the networks.
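
The abstract does not spell out how the tight-frame constraint is maintained during SGD. As a minimal sketch, assuming a retraction step of the form W ← (1 + β)W − βWWᵀW applied repeatedly (the function names, the value of β, the step count, and the matrix shape below are illustrative assumptions, not taken from this page), the following NumPy snippet shows how such a step pushes the singular values of a tall weight matrix toward 1, so that WᵀW ≈ I and the ℓ2 Lipschitz constant of the map x ↦ Wx approaches 1:

```python
import numpy as np


def parseval_retraction(W, beta=0.1, steps=1):
    """Apply W <- (1 + beta) * W - beta * W @ W.T @ W `steps` times.

    Each singular value s of W is mapped to s * (1 + beta * (1 - s**2)),
    which has s = 1 as a fixed point. Hypothetical helper for illustration.
    """
    for _ in range(steps):
        W = (1.0 + beta) * W - beta * (W @ W.T @ W)
    return W


def tight_frame_gap(W):
    """Spectral-norm distance between W.T @ W and the identity."""
    return np.linalg.norm(W.T @ W - np.eye(W.shape[1]), ord=2)


rng = np.random.default_rng(0)

# A tall (non-square) weight matrix: 64 outputs, 32 inputs (assumed shape).
W = rng.standard_normal((64, 32)) / np.sqrt(64)

gap_before = tight_frame_gap(W)
W_tight = parseval_retraction(W, beta=0.1, steps=200)
gap_after = tight_frame_gap(W_tight)

# Largest singular value = l2 Lipschitz constant of the map x -> W_tight @ x.
lipschitz = np.linalg.norm(W_tight, ord=2)

print(f"gap before: {gap_before:.3f}")
print(f"gap after:  {gap_after:.2e}")
print(f"Lipschitz constant after retraction: {lipschitz:.6f}")
```

In an actual training loop one would more plausibly interleave a single such step (with a small β) with each gradient update rather than run it to convergence; the 200 steps here only make the effect easy to see.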

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Layer-wise Derivative Controlled Networks
cs.LG 2026-05 unverdicted novelty 4.0

ChainzRule with DREG regularization claims 15.5x fewer parameters than standard models, 23.1% lower peak gradient volatility on MNIST, and 70.17% accuracy on Yelp Full ordinal regression.