Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
Title resolution pending
5 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
HFF replaces binary goodness-of-fit in Forward-Forward with hyperspherical prototypes for direct multi-class decisions, enabling single-forward-pass inference and training that scales to ImageNet while closing much of the gap to backpropagation.
PTA adapts VLMs at test time by maintaining and updating class-specific knowledge prototypes from test samples, achieving higher accuracy than cache-based methods with far less speed loss.
Intermediate layers in LLMs consistently provide stronger features than final layers across tasks and architectures, as quantified by a new framework of information-theoretic, geometric, and invariance metrics.
citing papers explorer
-
Dataset Distillation
Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
-
Hyperspherical Forward-Forward with Prototypical Representations
HFF replaces binary goodness-of-fit in Forward-Forward with hyperspherical prototypes for direct multi-class decisions, enabling single-forward-pass inference and training that scales to ImageNet while closing much of the gap to backpropagation.
-
Prototype-Based Test-Time Adaptation of Vision-Language Models
PTA adapts VLMs at test time by maintaining and updating class-specific knowledge prototypes from test samples, achieving higher accuracy than cache-based methods with far less speed loss.
-
Layer by Layer: Uncovering Hidden Representations in Language Models
Intermediate layers in LLMs consistently provide stronger features than final layers across tasks and architectures, as quantified by a new framework of information-theoretic, geometric, and invariance metrics.
- SURGE: Surrogate Gradient Adaptation in Binary Neural Networks