A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks. arXiv preprint arXiv:1707.09564
12 Pith papers cite this work.
abstract
We present a generalization bound for feedforward neural networks in terms of the product of the spectral norm of the layers and the Frobenius norm of the weights. The generalization bound is derived using a PAC-Bayes analysis.
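The abstract names the two ingredients of the bound but not how they combine. As a minimal sketch, the snippet below computes one plausible norm-based complexity term: the product of squared spectral norms times the sum of squared Frobenius-to-spectral norm ratios. This exact expression is an assumption for illustration; constants, depth and width factors, the margin, and the sample size are omitted, and the function name `spectral_complexity` is hypothetical.

```python
import numpy as np


def spectral_complexity(weights):
    """Assumed complexity term: prod_i ||W_i||_2^2 * sum_i ||W_i||_F^2 / ||W_i||_2^2.

    `weights` is a list of 2-D arrays, one per layer. This is an illustrative
    form, not necessarily the exact quantity in the paper's bound.
    """
    # Squared spectral norm (largest singular value) of each layer.
    spec_sq = [np.linalg.norm(W, 2) ** 2 for W in weights]
    # Squared Frobenius norm of each layer.
    frob_sq = [np.linalg.norm(W, "fro") ** 2 for W in weights]
    product = float(np.prod(spec_sq))
    ratio_sum = float(sum(f / s for f, s in zip(frob_sq, spec_sq)))
    return product * ratio_sum


# Toy two-layer network with diagonal weights so the norms can be checked by hand.
W1 = np.diag([2.0, 1.0])  # spectral norm 2, squared Frobenius norm 5
W2 = np.diag([3.0, 0.0])  # spectral norm 3, squared Frobenius norm 9
complexity = spectral_complexity([W1, W2])
# (4 * 9) * (5/4 + 9/9) = 81, up to floating-point error
print(complexity)
```

Diagonal weights are used here only so the result is easy to verify; for trained networks the same function can be applied to the list of layer weight matrices.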
citing papers explorer
- Aligning Network Equivariance with Data Symmetry: A Theoretical Framework and Adaptive Approach for Image Restoration
  A new dataset-level non-strict symmetry measure allows deriving bounded equivariance for restoration models and motivates an adaptive network that aligns with per-sample symmetry to reduce expected risk.
- Path Regularization: A Near-Complete and Optimal Nonasymptotic Generalization Theory for Multilayer Neural Networks and Double Descent Phenomenon
  A nonasymptotic generalization error upper bound for path-regularized multilayer neural networks with Lipschitz losses that exhibits double descent and is near-minimax optimal for ReLU regression.
- Tighter Learning Guarantees on Digital Computers via Concentration of Measure on Finite Spaces
  Derives adaptive generalization bounds {c_m / N^{1/(2∨m)}} for digital ML models via new concentration of measure results on finite metric spaces, with c_m = O(sqrt(m)).
- Rethinking Model Selection in VLM Through the Lens of Gromov-Wasserstein Distance
  Gromov-Wasserstein distance between modalities provides a stronger, inference-only predictor of final VLM performance than conventional encoder metrics, backed by theory linking it to cross-modal learnability and verified across 60+ training runs.
- Certified and accurate computation of function space norms of deep neural networks
  A certified adaptive quadrature framework computes guaranteed L^p, W^{1,p}, and W^{2,p} norms of deep neural networks by propagating interval enclosures on axis-aligned boxes.
- Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets
  Derives algorithm-dependent generalization bounds for neural nets using multilevel entropic regularization and proposes a Metropolis-simulated multi-scale Gibbs training procedure tested on a two-layer net for MNIST.
- A Qualitative Test-Risk Mechanism for Scaling Behavior in Normalized Residual Networks
  Depth expansion in normalized residual networks yields provable test-risk improvement through representational, optimization, and generalization gains under first-order descent and norm-control conditions.
- Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
  Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.
- Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
  Provides Hessian-based theoretical characterizations of SGD dynamics and a scale-invariant generalization bound for deep nets, backed by experiments on synthetic data, MNIST, and CIFAR-10.
- On improving deep learning generalization with adaptive sparse connectivity
  Sparse MLPs trained via SET plus neuron pruning achieve competitive performance on 15 datasets while pruning ~50% of hidden neurons and keeping parameter count linear in neuron count.
- Rethinking the Personalized Relaxed Initialization in the Federated Learning: Consistency and Generalization
  FedInit uses reverse personalized initialization in FL to reduce client drift effects, showing via excess risk that inconsistency impacts generalization error more than optimization error.
- Generalization error bounds for two-layer neural networks with Lipschitz loss function
  Generalization error bounds of order O(n^{-1/2}) (dimension-free) are derived for two-layer neural networks with Lipschitz losses under independent test data, and O(n^{-1/(d_in + d_out)}) without independence, using Wasserstein distances and SGD moment bounds.