archive

Every paper Pith has read. Search by title, abstract, or pith.

2684 papers in stat.ML · page 12

cs.LG 2026-05-07 reviewed

Low precision triggers slingshot loss spikes via feature inflation
Grokking or Glitching? How Low-Precision Drives Slingshot Loss Spikes

Liu Hanqing +3
stat.ML 2026-05-07 reviewed

Kernel embeddings define Gaussian mixtures for Hilbert space data
Gaussian mixture models in Hilbert spaces via kernel methods

Daniel López-Montero +2
stat.ML 2026-05-07 reviewed

TabCF turns tabular models into fast control function estimators
TabCF: Distributional Control Function Estimation with Tabular Foundation Models

Geping Chen +4
stat.ML 2026-05-07 reviewed

Repeated splits fix winner's curse in LLM adaptive benchmarks
Towards Reliable LLM Evaluation: Correcting the Winner's Curse in Adaptive Benchmarking

Yang Xu +5
cs.LG 2026-05-07 reviewed

This paper proves that for many kernels
Sharper Guarantees for Misspecified Kernelized Bandit Optimization

Davide Maran +1
stat.ML 2026-05-07 reviewed

Derivatives define causal fairness for continuous attributes
Tuning Derivatives for Causal Fairness in Machine Learning

Filip Edström +3
stat.ML 2026-05-07 reviewed

CITE certifies target answers as LLM response modes with anytime-valid guarantees
CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency

Hirofumi Ota +4
stat.ME 2026-05-07 reviewed

Kernel copula embeddings detect causal dependence shifts
Detecting Changes in Causal Dependence with Kernels and Copulas

Shakeel Gavioli-Akilagun +2
stat.ML 2026-05-07 reviewed

Ratio-based losses track relative errors via y over f(x)
Ratio-based Loss Functions

Lena Helgerth +1
math.ST 2026-05-07 reviewed

Kernel gradient flows match minimax uniform rates
Optimal Confidence Band for Kernel Gradient Flow Estimator

Yuqian Cheng +2
stat.ML 2026-05-07 reviewed

Transformers execute RL policy updates from context alone
Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement

Haodong Liang +1
stat.ML 2026-05-07 reviewed

Fourier features scale nonlinear causal discovery to mixed data
Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring, TRFF Scoring, and FFCI Testing in Mixed Data

Joseph D. Ramsey
stat.ML 2026-05-07 reviewed

Fourier features scale GP scoring and CI tests for nonlinear causal discovery
Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring, TRFF Scoring, and FFCI Testing in Mixed Data

Joseph D. Ramsey
math.NA 2026-05-07 reviewed

Convex hulls give O(d/N) error for positive kernel quadrature
Convex-Geometric Error Bounds for Positive-Weight Kernel Quadrature

Satoshi Hayakawa
cs.LG 2026-05-07 reviewed

KAN spline removal degrades time-series forecasts
Temporal Functional Circuits: From Spline Plots to Faithful Explanations in KAN Forecasting

Naveen Mysore
stat.ML 2026-05-07 reviewed

Early spectra forecast token efficiency in LLM training
Spectral Lens: Activation and Gradient Spectra as Diagnostics of LLM Optimization

Andy Zeyi Liu +2
stat.ML 2026-05-07 reviewed

vMF spherical flows generate categorical data from posterior alone
Spherical Flows for Sampling Categorical Data

Jannis Chemseddine +2
stat.ML 2026-05-07 reviewed

Spherical vMF flow reduces categorical sampling to scalar ODE
Spherical Flows for Sampling Categorical Data

Jannis Chemseddine +2
cs.LG 2026-05-07 reviewed

Residual from spectral analysis flags grokking transitions early
Distributional Spectral Diagnostics for Localizing Grokking Transitions

Ziyue Wang +2
cs.LG 2026-05-07 reviewed

This paper develops a new algorithm for setting prices dynamically when customer demand…
Optimal Contextual Pricing under Agnostic Non-Lipschitz Demand

Jianyu Xu +1
stat.ML 2026-05-07 reviewed

Neural score from backward PDE defines posterior SDE for sparse smoothing
Variational Smoothing and Inference for SDEs from Sparse Data with Dynamic Neural Flows

Yu Wang +1
stat.ML 2026-05-07 reviewed

Pretrained transformer solves PU classification in one forward pass
In-Context Positive-Unlabeled Learning

Siyan Liu +4
stat.ML 2026-05-07 reviewed

Relaxed Cholesky method scales causal discovery to 10k variables
Relaxed Sparsest-Permutation Formulation for Causal Discovery at Scale

Sunmin Oh +2
stat.ML 2026-05-06 reviewed

Symmetry-aware nets learn non-stationary GP kernels scalably
Permutation-preserving Functions and Neural Vecchia Covariance Kernels

Jian Cao +2
cs.LG 2026-05-06 reviewed

Diffusion priors sharpen rain maps from microwave links
Bayesian Rain Field Reconstruction using Commercial Microwave Links and Diffusion Model Priors

Badr Moufad +6
cs.LG 2026-05-06 reviewed

Pathwise gradients optimize non-myopic feature acquisition
Non-Myopic Active Feature Acquisition via Pathwise Policy Gradients

Linus Aronsson +1
cs.LG 2026-05-06 reviewed

Linear attribution methods share one canonical form
GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

Raimondo Fanale
stat.ML 2026-05-06 reviewed

Benign regularizer exposes hidden local convexity in nonconvex matrix estimation
Convexity in Disguise: A Theoretical Framework for Nonconvex Low-Rank Matrix Estimation

Chengyu Cui +1
stat.ML 2026-05-06 reviewed

Gradient matching recovers hidden penalties in neural net training
Estimating Implicit Regularization in Deep Learning

Joseph H. Rudoler +3
math.ST 2026-05-06 reviewed

Direct estimator gives finite-sample bounds for Schrödinger bridge drifts
Direct Estimation of Schrödinger Bridge Time-Series Drifts: Finite-Sample, Asymptotic, and Adaptive Guarantees

Othmane Mazhar +1
cs.LG 2026-05-06 reviewed

Adaptive elastic nets fix feature starvation in sparse autoencoders
Feature Starvation as Geometric Instability in Sparse Autoencoders

Faris Chaudhry +2
stat.ML 2026-05-06 reviewed

Linear memory reaches n capacity with listwise but only n/log n with top-1
Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

Nicholas Barnfield +4
cs.LG 2026-05-06 reviewed

Cumulant approximations estimate wide MLP outputs with fewer FLOPs
Estimating the expected output of wide random MLPs more efficiently than sampling

Wilson Wu +5
cs.LG 2026-05-06 reviewed

Wide random MLPs yield expected outputs without sampling
Estimating the expected output of wide random MLPs more efficiently than sampling

Wilson Wu +5
cs.LG 2026-05-06 reviewed

Drifting models match Wasserstein flow fixed points on KL
On the Wasserstein Gradient Flow Interpretation of Drifting Models

Arthur Gretton +5
cs.LG 2026-05-06 reviewed

GMD targets fixed points of Wasserstein gradient flows
On the Wasserstein Gradient Flow Interpretation of Drifting Models

Arthur Gretton +5
cs.LG 2026-05-06 reviewed

Distributional regret bounds unify bandits and episodic RL
Unified Framework of Distributional Regret in Multi-Armed Bandits and Reinforcement Learning

Harin Lee +1
cs.GR 2026-05-06 reviewed

Bayesian view selection cuts scans needed for task-specific 3D models
A Bayesian Approach for Task-Specific Next-Best-View Selection with Uncertain Geometry

Jingsen Zhu +2
stat.ML 2026-05-06 reviewed

Decomposing coefficients by graph nodes yields stable doubly sparse regression
Proximal Projection for Doubly Sparse Regularized Models

Jia Wei He +2
math.ST 2026-05-06 reviewed

High-dimensional statistics connects to optimization and random matrices
High-Dimensional Statistics: Reflections on Progress and Open Problems

Arian Maleki +11
stat.ML 2026-05-06 reviewed

Diffusion on incidence matrices generates better hypergraphs
Hypergraph Generation via Structured Stochastic Diffusion

Christopher Nemeth
stat.ML 2026-05-06 reviewed

MDL method finds spatial regions and their time series drivers
Scalable inference of spatial regions and temporal signatures from time series

Jiayu Weng +1
cs.LG 2026-05-06 reviewed

Adaptivity advantages shift under ReLU realizability
Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

Anastasis Kratsios +2
cs.LG 2026-05-06 reviewed

Batch normalization refines local affine partitions during training
Training-Time Batch Normalization Reshapes Local Partition Geometry in Piecewise-Affine Networks

Xuan Qi +5
cs.LG 2026-05-06 reviewed

BN recenters hyperplanes to refine local partitions in networks
Training-Time Batch Normalization Reshapes Local Partition Geometry in Piecewise-Affine Networks

Xuan Qi +5
stat.ML 2026-05-06 reviewed

Directional energy in drift subspace bounds frozen predictor risk
Jacobian-Velocity Bounds for Deployment Risk Under Covariate Drift

Jonathan R. Landers
cs.LG 2026-05-06 reviewed

Causal GRN methods beat correlations only in clean data
When Does Gene Regulatory Network Inference Break? A Controlled Diagnostic Study of Causal and Correlational Methods on Single-Cell Data

Miguel Fernandez-de-Retana +4
cs.LG 2026-05-06 reviewed

Regime score flips Bayesian optimization winners across budgets
Regime-Conditioned Evaluation in Multi-Context Bayesian Optimization

Noel Thomas
cs.LG 2026-05-06 reviewed

Symmetric attention diagnostics miss flow direction
Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

Dominik Dahlem +2