Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
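As a rough illustration of the idea summarized above, the following is a minimal sketch and not the paper's implementation. It assumes a linear regression model instead of a neural network, a single inner gradient step from a fixed zero initialization, a fixed inner learning rate, and hand-derived gradients. All names (`inner_step`, `real_loss`, `distill`) and all hyperparameter values are hypothetical, chosen only for this example.

```python
import numpy as np

def inner_step(Xs, ys, w0, lr):
    """One gradient step on the synthetic set, starting from the fixed init w0."""
    m = len(ys)
    grad = Xs.T @ (Xs @ w0 - ys) / m  # gradient of (1/2m)||Xs w - ys||^2 at w0
    return w0 - lr * grad

def real_loss(X, y, w):
    """Half mean squared error on the real data."""
    return 0.5 * np.mean((X @ w - y) ** 2)

def distill(X, y, m=2, inner_lr=1.0, outer_lr=0.1, steps=2000, seed=0):
    """Learn m synthetic points so that one inner step from w0 fits the real data.

    Assumption: linear model and one inner step, so the gradient of the
    real-data loss with respect to the synthetic set has a closed form.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w0 = np.zeros(d)              # fixed initialization shared by every run
    Xs = rng.normal(size=(m, d))  # synthetic inputs (learned)
    ys = np.zeros(m)              # synthetic targets (learned)
    for _ in range(steps):
        w1 = inner_step(Xs, ys, w0, inner_lr)
        r = X.T @ (X @ w1 - y) / n  # gradient of the real loss w.r.t. w1
        e = Xs @ w0 - ys            # synthetic residual at w0
        # Chain rule through w1 = w0 - inner_lr * Xs^T (Xs w0 - ys) / m
        grad_Xs = -(inner_lr / m) * (np.outer(e, r) + np.outer(Xs @ r, w0))
        grad_ys = (inner_lr / m) * (Xs @ r)
        Xs -= outer_lr * grad_Xs
        ys -= outer_lr * grad_ys
    return Xs, ys, w0
```

Under these assumptions, calling `distill(X, y)` on a few hundred real points returns two synthetic points, and `inner_step(Xs, ys, w0, 1.0)` then yields weights trained on the synthetic set alone. The synthetic set is tied to the fixed `w0`: a different initialization would generally need a different distilled set.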
http://yann
7 Pith papers cite this work.
citing papers explorer
- Dataset Distillation
  Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
- Continual Learning of Domain-Invariant Representations
  Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.
- Fixed-Point Neural Optimal Transport without Implicit Differentiation
  A single-network fixed-point formulation for neural optimal transport eliminates adversarial min-max optimization and implicit differentiation while enforcing dual feasibility exactly.
- On the Stability and Generalization of First-order Bilevel Minimax Optimization
  Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.
- Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems
  Nexa learns a response-conditioned policy that starts with parallel agent execution and adds at most one round of sequential message passing via a predicted sparse DAG, strictly subsuming pure parallel mode.
- Learning Dynamics of Zeroth-Order Optimization: A Kernel Perspective
  Zeroth-order SGD learning dynamics are governed by a random low-dimensional projection of the empirical NTK whose approximation error scales with model output dimension, not parameter count.
- Possibilistic Predictive Uncertainty for Deep Learning
  DAPPr introduces a possibilistic framework that projects parameter posteriors to predictions via supremum and approximates them with Dirichlet possibility functions to yield efficient, closed-form epistemic uncertainty estimates.