pith. machine review for the scientific record.

arxiv: 1903.08850 · v2 · submitted 2019-03-21 · 📊 stat.ML · cs.LG · cs.NE

Recognition: unknown

Stochastic Optimization of Sorting Networks via Continuous Relaxations

Authors on Pith: no claims yet
classification 📊 stat.ML · cs.LG · cs.NE
keywords sorting · optimization · relaxation · continuous · gradient-based · learning · matrices · objects
original abstract

Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph involving a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
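The relaxation the abstract describes can be sketched numerically. Following the paper's construction, a score vector s of length n is mapped to a unimodal row-stochastic matrix whose row i is softmax(((n + 1 - 2i)·s - A_s·1) / τ), where A_s is the matrix of pairwise absolute differences; as the temperature τ → 0 this approaches the permutation matrix sorting s in descending order. The Gumbel-perturbation sampler below reflects the paper's reparameterization for Plackett-Luce distributions; the function names and the NumPy phrasing are this sketch's own choices, not the authors' released code.

```python
import numpy as np

def neuralsort(s, tau=1.0):
    """Continuous relaxation of the (descending) sorting operator.

    Maps scores s (shape [n]) to a unimodal row-stochastic matrix
    P_hat (shape [n, n]): every row sums to one, and as tau -> 0
    each row concentrates on the index it would place at that rank.
    """
    s = np.asarray(s, dtype=float).reshape(-1, 1)   # column vector [n, 1]
    n = s.shape[0]
    A = np.abs(s - s.T)                              # A_s[i, j] = |s_i - s_j|
    B = A.sum(axis=0, keepdims=True)                 # (A_s @ 1)^T, shape [1, n]
    scaling = (n + 1 - 2 * np.arange(1, n + 1)).reshape(-1, 1)
    logits = (scaling * s.T - B) / tau               # row i: ((n+1-2i) s - A_s 1) / tau
    z = logits - logits.max(axis=1, keepdims=True)   # stable row-wise softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sample_neuralsort(s, tau=1.0, rng=None):
    """Relaxed sample from a Plackett-Luce distribution with log-scores s:
    perturb the scores with i.i.d. Gumbel noise, then relax the induced
    sort. Gradients flow to s through the perturbed scores (reparameterization).
    """
    rng = np.random.default_rng() if rng is None else rng
    g = rng.gumbel(size=np.shape(s))
    return neuralsort(np.asarray(s, dtype=float) + g, tau)
```

For s = [1, 3, 2] and a small temperature, the row-wise arg maxes of `neuralsort(s, 0.1)` recover the descending sort order (indices 1, 2, 0), while larger temperatures yield softer, still row-stochastic matrices usable inside a differentiable pipeline.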

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations

    cs.LG 2026-04 unverdicted novelty 6.0

    SWAN is the first adaptive multimodal network that meets variable compute budgets, optimizes layer use by sample complexity, and drops irrelevant features, cutting FLOPs up to 49% in 3D object detection with minimal a...

  2. Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation

    cs.IR 2026-04 unverdicted novelty 5.0

    SSR uses static random filters and iterative competitive sparse mechanisms to explicitly enforce sparsity in recommendation models, outperforming dense baselines on public and billion-scale industrial datasets.