Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
http://yann
7 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 7representative citing papers
Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.
A single-network implicit neural optimal transport method that solves the c-transform via proximal fixed-point iteration for stable, non-adversarial training.
Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.
Nexa learns a response-conditioned policy that starts with parallel agent execution and adds at most one round of sequential message passing via a predicted sparse DAG, strictly subsuming pure parallel mode.
Zeroth-order SGD learning dynamics are governed by a random low-dimensional projection of the empirical NTK whose approximation error scales with model output dimension, not parameter count.
DAPPr projects a possibilistic posterior over network parameters to predictions using supremum operators and approximates it with learnable Dirichlet functions to yield an efficient training objective for epistemic uncertainty.
citing papers explorer
-
Dataset Distillation
Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
-
Continual Learning of Domain-Invariant Representations
Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.
-
Implicit Neural Optimal Transport via Fixed-Point Optimization
A single-network implicit neural optimal transport method that solves the c-transform via proximal fixed-point iteration for stable, non-adversarial training.
-
On the Stability and Generalization of First-order Bilevel Minimax Optimization
Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.
-
Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems
Nexa learns a response-conditioned policy that starts with parallel agent execution and adds at most one round of sequential message passing via a predicted sparse DAG, strictly subsuming pure parallel mode.
-
Learning Dynamics of Zeroth-Order Optimization: A Kernel Perspective
Zeroth-order SGD learning dynamics are governed by a random low-dimensional projection of the empirical NTK whose approximation error scales with model output dimension, not parameter count.
-
Possibilistic Predictive Uncertainty for Deep Learning
DAPPr projects a possibilistic posterior over network parameters to predictions using supremum operators and approximates it with learnable Dirichlet functions to yield an efficient training objective for epistemic uncertainty.