arXiv preprint arXiv:1903.08850 (2019)

Grover, A · 2019 · stat.ML · arXiv 1903.08850

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph that involves a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
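The relaxation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract itself does not give a formula; the row-wise softmax below follows the form commonly quoted from the NeuralSort paper, where row i is softmax(((n + 1 - 2i) s - A_s 1) / tau) with A_s[j, k] = |s_j - s_k|. Treat it as an assumption rather than the authors' reference implementation. The function name `neural_sort`, the temperature argument `tau`, and the example scores are illustrative.

```python
import numpy as np


def neural_sort(s, tau=1.0):
    """Relaxed descending-sort operator for a 1-D score vector s.

    Assumed form (not stated in the abstract): row i of the output is
    softmax(((n + 1 - 2i) * s - A_s @ 1) / tau), A_s[j, k] = |s_j - s_k|.
    """
    s = np.asarray(s, dtype=float)
    n = s.shape[0]
    # Pairwise absolute differences and their row sums, (A_s 1)_j.
    abs_diff = np.abs(s[:, None] - s[None, :])
    row_sums = abs_diff.sum(axis=1)
    # Coefficients (n + 1 - 2i) for i = 1..n, one per output row.
    scaling = n + 1 - 2 * np.arange(1, n + 1)
    logits = (scaling[:, None] * s[None, :] - row_sums[None, :]) / tau
    # Numerically stable softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


scores = np.array([0.3, 2.0, -1.0, 0.9])
P_hat = neural_sort(scores, tau=0.1)

# Each row sums to one, and the row-wise arg max recovers the
# descending sort order of the scores.
hard_order = P_hat.argmax(axis=1)
# Applying the relaxed matrix to the scores gives a soft-sorted vector.
soft_sorted = P_hat @ scores
```

In this sketch a smaller `tau` pushes each row closer to a one-hot vector, so `P_hat` approaches the hard permutation matrix, while a larger `tau` gives smoother rows.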

representative citing papers

Learning Unbiased Permutations via Flow Matching

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

PermFlow applies conditional flow matching on the affine subspace of doubly stochastic matrices with a closed-form tangent projector and nearest-target coupling to capture multimodal permutation distributions.

LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics

cs.LG · 2025-11-11 · conditional · novelty 6.0

LeJEPA derives an optimal isotropic Gaussian target for embeddings and enforces it via sketched regularization to deliver scalable, heuristics-free self-supervised pretraining with 79% ImageNet linear accuracy on ViT-H/14.

SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

SWAN is the first adaptive multimodal network that meets variable compute budgets, optimizes layer use by sample complexity, and drops irrelevant features, cutting FLOPs up to 49% in 3D object detection with minimal accuracy loss.

Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation

cs.IR · 2026-04-09 · unverdicted · novelty 5.0

SSR uses static random filters and iterative competitive sparse mechanisms to explicitly enforce sparsity in recommendation models, outperforming dense baselines on public and billion-scale industrial datasets.

citing papers explorer

Showing 4 of 4 citing papers.

Learning Unbiased Permutations via Flow Matching
cs.LG · 2026-05-16 · unverdicted · none · ref 13 · internal anchor

LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics
cs.LG · 2025-11-11 · conditional · none · ref 93 · internal anchor

SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations
cs.LG · 2026-04-28 · unverdicted · none · ref 10

Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation
cs.IR · 2026-04-09 · unverdicted · none · ref 11
