A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks
We present a generalization bound for feedforward neural networks in terms of the product of the spectral norm of the layers and the Frobenius norm of the weights. The generalization bound is derived using a PAC-Bayes analysis.
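The abstract does not spell out the exact form of the bound. As an illustration only, the sketch below assumes the complexity term takes the form in which this result is commonly stated, namely the product of squared spectral norms times the sum of squared Frobenius-to-spectral ratios, prod_i ||W_i||_2^2 * sum_i ||W_i||_F^2 / ||W_i||_2^2. The function names (`spectral_norm`, `frobenius_norm`, `spectral_complexity`) are hypothetical, the spectral norm is estimated by plain power iteration, and the remaining factors of the bound (margin, sample size, depth, width, input norm) are omitted.

```python
import math


def frobenius_norm(W):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in W for x in row))


def spectral_norm(W, iters=200):
    """Estimate the largest singular value of W by power iteration on W^T W.

    Assumption: the fixed all-ones start vector is not orthogonal to the
    top right singular vector; otherwise the estimate can be too small.
    """
    n_rows, n_cols = len(W), len(W[0])
    v = [1.0] * n_cols
    for _ in range(iters):
        u = [sum(w * x for w, x in zip(row, v)) for row in W]  # u = W v
        v = [sum(W[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # v = W^T u
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    Wv = [sum(w * x for w, x in zip(row, v)) for row in W]
    return math.sqrt(sum(x * x for x in Wv))


def spectral_complexity(weights):
    """Assumed complexity term: prod_i ||W_i||_2^2 * sum_i ||W_i||_F^2 / ||W_i||_2^2.

    Each layer is a list of rows; layers with zero spectral norm are not handled.
    """
    spec_sq = [spectral_norm(W) ** 2 for W in weights]
    frob_sq = [frobenius_norm(W) ** 2 for W in weights]
    product = math.prod(spec_sq)
    ratio_sum = sum(f / s for f, s in zip(frob_sq, spec_sq))
    return product * ratio_sum


# Toy two-layer example with diagonal weight matrices (illustrative values only).
layers = [
    [[2.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 0.0]],
]
print(spectral_complexity(layers))
```

For the toy layers above, the spectral norms are 2 and 3 and the squared Frobenius norms are 5 and 9, so the term is 36 × (5/4 + 9/9) = 81 up to numerical error.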
Forward citations
Cited by 12 Pith papers
-
Aligning Network Equivariance with Data Symmetry: A Theoretical Framework and Adaptive Approach for Image Restoration
A new dataset-level non-strict symmetry measure allows deriving bounded equivariance for restoration models and motivates an adaptive network that aligns with per-sample symmetry to reduce expected risk.
-
Rethinking Model Selection in VLM Through the Lens of Gromov-Wasserstein Distance
Gromov-Wasserstein distance between modalities provides a stronger, inference-only predictor of final VLM performance than conventional encoder metrics, backed by theory linking it to cross-modal learnability and veri...
-
Path Regularization: A Near-Complete and Optimal Nonasymptotic Generalization Theory for Multilayer Neural Networks and Double Descent Phenomenon
A nonasymptotic generalization error upper bound for path-regularized multilayer neural networks with Lipschitz losses that exhibits double descent and is near-minimax optimal for ReLU regression.
-
Tighter Learning Guarantees on Digital Computers via Concentration of Measure on Finite Spaces
Derives adaptive generalization bounds {c_m / N^{1/(2∨m)}} for digital ML models via new concentration of measure results on finite metric spaces, with c_m = O(sqrt(m)).
-
A Qualitative Test-Risk Mechanism for Scaling Behavior in Normalized Residual Networks
Depth expansion in normalized residual networks yields provable test-risk improvement through representational, optimization, and generalization gains under first-order descent and norm-control conditions.
-
Certified and accurate computation of function space norms of deep neural networks
A certified adaptive quadrature framework computes guaranteed L^p, W^{1,p}, and W^{2,p} norms of deep neural networks by propagating interval enclosures on axis-aligned boxes.
-
Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets
Derives algorithm-dependent generalization bounds for neural nets using multilevel entropic regularization and proposes a Metropolis-simulated multi-scale Gibbs training procedure tested on a two-layer net for MNIST.
-
Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.
-
Rethinking the Personalized Relaxed Initialization in the Federated Learning: Consistency and Generalization
FedInit uses reverse personalized initialization in FL to reduce client drift effects, showing via excess risk that inconsistency impacts generalization error more than optimization error.
-
Generalization error bounds for two-layer neural networks with Lipschitz loss function
Generalization error bounds of order O(n^{-1/2}) (dimension-free) are derived for two-layer neural networks with Lipschitz losses under independent test data, and O(n^{-1/(d_in + d_out)}) without independence, using W...
-
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Provides Hessian-based theoretical characterizations of SGD dynamics and a scale-invariant generalization bound for deep nets, backed by experiments on synthetic data, MNIST, and CIFAR-10.
-
On improving deep learning generalization with adaptive sparse connectivity
Sparse MLPs trained via SET plus neuron pruning achieve competitive performance on 15 datasets while pruning ~50% of hidden neurons and keeping parameter count linear in neuron count.