Learning multiple layers of features from tiny images
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts: UNVERDICTED 6
roles: dataset 2
polarities: use dataset 2
representative citing papers
A new framework is introduced for end-to-end provable robustness against backdoor attacks by composing randomized smoothing with differentially private training via privacy profiles.
High accuracy in noisy-label learning does not guarantee OOD detection reliability due to uncertainty collapse, and Virtual Margin Regularization offers partial mitigation.
Bayesian Model Merging introduces a bi-level optimization framework that merges task-specific models via closed-form Bayesian regression with an anchor prior and global hyperparameter search, outperforming baselines and nearly matching expert averages on up to 20-task vision and 5-task language Merg
Norm-matched zeroth-order adaptation preserves the isotropic retention floor while contracting only the anisotropic component, producing a quadratic forgetting gap that favors ZO precisely when the first-order direction has above-average retention curvature.
Data augmentation enables CNNs to adapt to varying architectures and data amounts without hyperparameter fine-tuning, unlike weight decay and dropout.
citing papers explorer
- Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
Neural LoFi models deep learning as layer-wise spectral filtering that selects maximal low-degree correlations, yielding a tractable surrogate for hierarchical representation learning beyond the lazy regime.
- Why Zeroth-Order Adaptation May Forget Less: A Randomized Shaping Theory
Norm-matched zeroth-order adaptation preserves the isotropic retention floor while contracting only the anisotropic component, producing a quadratic forgetting gap that favors ZO precisely when the first-order direction has above-average retention curvature.