http://yann

The MNIST database of handwritten digits

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Dataset Distillation

cs.LG · 2018-11-27 · unverdicted · novelty 8.0

Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.

Continual Learning of Domain-Invariant Representations

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.

Fixed-Point Neural Optimal Transport without Implicit Differentiation

math.OC · 2026-05-11 · unverdicted · novelty 7.0

A single-network fixed-point formulation for neural optimal transport eliminates adversarial min-max optimization and implicit differentiation while enforcing dual feasibility exactly.

On the Stability and Generalization of First-order Bilevel Minimax Optimization

cs.LG · 2026-04-22 · unverdicted · novelty 7.0

Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.

Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems

cs.CL · 2026-05-15 · unverdicted · novelty 6.0

Nexa learns a response-conditioned policy that starts with parallel agent execution and adds at most one round of sequential message passing via a predicted sparse DAG, strictly subsuming pure parallel mode.

Learning Dynamics of Zeroth-Order Optimization: A Kernel Perspective

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

Zeroth-order SGD learning dynamics are governed by a random low-dimensional projection of the empirical NTK whose approximation error scales with model output dimension, not parameter count.

Possibilistic Predictive Uncertainty for Deep Learning

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

DAPPr introduces a possibilistic framework that projects parameter posteriors to predictions via supremum and approximates them with Dirichlet possibility functions to yield efficient, closed-form epistemic uncertainty estimates.

citing papers explorer

Showing 7 of 7 citing papers.

Dataset Distillation · cs.LG · 2018-11-27 · unverdicted · none · ref 51
Continual Learning of Domain-Invariant Representations · cs.LG · 2026-05-15 · unverdicted · none · ref 81
Fixed-Point Neural Optimal Transport without Implicit Differentiation · math.OC · 2026-05-11 · unverdicted · none · ref 168
On the Stability and Generalization of First-order Bilevel Minimax Optimization · cs.LG · 2026-04-22 · unverdicted · none · ref 95
Response-Conditioned Parallel-to-Sequential Orchestration for Multi-Agent Systems · cs.CL · 2026-05-15 · unverdicted · none · ref 75
Learning Dynamics of Zeroth-Order Optimization: A Kernel Perspective · cs.LG · 2026-05-05 · unverdicted · none · ref 47
Possibilistic Predictive Uncertainty for Deep Learning · cs.LG · 2026-05-01 · unverdicted · none · ref 6
