Deep learning via Hessian-free optimization
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
roles: background (1)
polarities: background (1)
citing papers explorer
- FlashSinkhorn: IO-Aware Entropic Optimal Transport on GPU
  FlashSinkhorn delivers up to 32x forward and 161x end-to-end speedups for entropic OT on A100 GPUs via IO-aware Triton kernels that fuse log-domain updates and streaming transport application.
- Power Reinforcement Post-Training of Text-to-Image Models with Super-Linear Advantage Shaping
  Super-Linear Advantage Shaping (SLAS) introduces a non-linear geometric policy update for RL post-training of text-to-image models that reshapes the local policy space via advantage-dependent Fisher-Rao weighting to reduce reward hacking and improve performance over GRPO baselines.
- A Regularized Hessian-Free Inexact Newton-Type Method with Global $\mathcal{O}(k^{-2})$ Convergence
  A new regularized Hessian-free Newton-type method for smooth convex optimization achieves global O(k^{-2}) convergence and local quadratic convergence in a variant, with practical speedups over prior methods.