Spectral Norm Regularization for Improving the Generalizability of Deep Learning

Takeru Miyato; Yuichi Yoshida

arxiv: 1705.10941 · v1 · pith:SIGUNBWLnew · submitted 2017-05-31 · 📊 stat.ML · cs.LG

keywords norm · regularization · spectral · generalizability · perturbation · sensitivity · deep · high

We investigate the generalizability of deep learning based on the sensitivity to input perturbation. We hypothesize that the high sensitivity to the perturbation of data degrades the performance on it. To reduce the sensitivity to perturbation, we propose a simple and effective regularization method, referred to as spectral norm regularization, which penalizes the high spectral norm of weight matrices in neural networks. We provide supportive evidence for the abovementioned hypothesis by experimentally confirming that the models trained using spectral norm regularization exhibit better generalizability than other baseline methods.
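
As a rough illustration of the penalty described in the abstract, the sketch below estimates the spectral norm (largest singular value) of each weight matrix by power iteration and sums the squared norms. The function names, the coefficient `lam`, the 1/2 scaling, and the use of power iteration are assumptions made for illustration; the abstract only states that high spectral norms of weight matrices are penalized.

```python
import numpy as np


def spectral_norm(W, n_iter=50, seed=0):
    """Estimate the largest singular value of W by power iteration.

    n_iter and seed are illustrative parameters, not values from the paper.
    """
    eps = 1e-12  # guards against division by zero for an all-zero matrix
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v) + eps
    u = np.zeros(W.shape[0])
    for _ in range(n_iter):
        # Alternate between the left and right singular vector estimates.
        u = W @ v
        u /= np.linalg.norm(u) + eps
        v = W.T @ u
        v /= np.linalg.norm(v) + eps
    # With unit vectors u and v, u^T W v approximates the top singular value.
    return float(u @ W @ v)


def spectral_penalty(weights, lam=0.01):
    """Assumed penalty form: (lam / 2) * sum of squared spectral norms."""
    return 0.5 * lam * sum(spectral_norm(W) ** 2 for W in weights)


if __name__ == "__main__":
    layers = [np.diag([3.0, 1.0]), np.diag([2.0, 0.5])]
    print(spectral_norm(layers[0]))
    print(spectral_penalty(layers, lam=0.1))
```

In a training loop, a term of this kind would be added to the task loss, so that gradient descent discourages any layer from amplifying input perturbations along its dominant singular direction.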

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Gradient-Based Program Synthesis with Neurally Interpreted Languages
cs.LG · 2026-04 · unverdicted · novelty 8.0
NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...

Navigating Potholes with Geometry-Aware Sharpness Minimization
cs.LG · 2026-05 · unverdicted · novelty 7.0
LLQR+SAM pairs a slow learned geometry preconditioner with fast SAM perturbations to amplify escape from locally sharp 'potholes' while stabilizing flat basins, producing consistent gains over SAM and LLQR alone.

Accelerating Inference for Multilayer Neural Networks with Quantum Computers
quant-ph · 2025-10 · unverdicted · novelty 7.0
Quantum circuits for coherent multilayer neural network inference achieve quadratic to polylogarithmic speedups over classical methods depending on quantum data access models for inputs and weights.

When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
cs.LG · 2026-05 · unverdicted · novelty 6.0
Dynamic Gradient Gating monitors lm_head gradient norms to safely reuse rollout batches in RLVR, achieving up to 2.93x sample efficiency and 2.14x wall-clock speedup across math, ALFWorld, WebShop, and QA tasks.

Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement
cs.CR · 2026-04 · unverdicted · novelty 6.0
Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.

Upper Generalization Bounds for Neural Oscillators
cs.LG · 2026-03 · conditional · novelty 6.0
Upper generalization bounds for neural oscillators scale polynomially with MLP size and time length, avoiding the curse of parametric complexity, with numerical validation on a Bouc-Wen nonlinear system.

ReachNN: Reachability Analysis of Neural-Network Controlled Systems
eess.SY · 2019-06 · unverdicted · novelty 6.0
ReachNN abstracts feedforward neural networks with Bernstein polynomials and provides error bounds to compute reachable sets for verifying neural-network controlled systems with general Lipschitz-continuous activation...

Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
cs.LG · 2026-05 · unverdicted · novelty 5.0
Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.

Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation
cs.LG · 2026-05 · unverdicted · novelty 5.0
Pion is an optimizer that preserves the singular values of weight matrices in LLM training by applying orthogonal equivalence transformations.

Mean Spectral Normalization of Deep Neural Networks for Embedded Automation
cs.LG · 2019-07 · unverdicted · novelty 4.0
Proposes MSN reparameterization to address mean-drift in SN, claiming ~16% faster inference than BN with fewer parameters on CNNs and GANs.