archive

Every paper Pith has read. Search by title, abstract, or pith.

2684 papers in stat.ML · page 8

math.AT 2026-05-11 reviewed

Stable barcodes track how dependency clusters evolve in dynamic Bayesian networks
A Stable Distance Persistence Homology for Dynamic Bayesian Network Clustering

Will Bales +1
stat.ML 2026-05-11 reviewed

Thompson sampling learns unknown networks while optimizing treatments
Adaptive Policy Learning Under Unknown Network Interference

Aidan Gleich +2
cs.LG 2026-05-11 reviewed

Random spectra match Muon on GPT-2 training
Muon is Not That Special: Random or Inverted Spectra Work Just as Well

Zakhar Shumaylov +8
stat.ML 2026-05-11 reviewed

Kernel makes rotated 3D anisotropy explicit in Gaussian processes
Interpretable Machine Learning for Spatial Science: A Lie-Algebraic Kernel for Rotationally Anisotropic Gaussian Processes

Kane Warrior +1
cs.CV 2026-05-11 reviewed

CutMix training induces local attention in early ViT layers
Inducing Spatial Locality in Vision Transformers through the Training Protocol

Eduardo Santiago Toledo +1
stat.ME 2026-05-11 reviewed

Predictive resampling yields exact Bayesian posteriors
Variational predictive resampling

Laura Battaglia +4
stat.ME 2026-05-11 reviewed

VPR with mean-field predictives matches exact posteriors
Variational predictive resampling

Laura Battaglia +4
cs.IT 2026-05-11 reviewed

Synthesize likelihoods to meet accuracy bounds with minimal prior deviation
Sensor Design for Accuracy-Bounded Estimation via Maximum-Entropy Likelihood Synthesis

Raktim Bhattacharya
cs.LG 2026-05-11 reviewed

Neural tilting of Lévy measures enables jump-preserving SDE inference
Variational Inference for Lévy Process-Driven SDEs via Neural Tilting

Yaman Kindap +4
cs.LG 2026-05-11 reviewed

k-step policy gradients escape myopic traps in restricted MDPs
Revisiting Policy Gradients for Restricted Policy Classes: Escaping Myopic Local Optima with $k$-step Policy Gradients

Alex DeWeese +1
stat.ML 2026-05-11 reviewed

Transformer states converge uniformly to ODEs at rate O(1/L + 1/(L^{1/3} sqrt(H)))
Uniform Scaling Limits in AdamW-Trained Transformers

William Gibson +1
cs.AI 2026-05-11 reviewed

Reasoning helps LLM judges only on hard tasks
Reasoning Is Not Free: Robust Adaptive Cost-Efficient Routing for LLM-as-a-Judge

Wenbo Zhang +3
stat.ML 2026-05-11 reviewed

Linear networks store facts up to p log p = d²/2
Factual recall in linear associative memories: sharp asymptotics and mechanistic insights

Alessio Giorlandino +2
math.ST 2026-05-11 reviewed

Finite VC dimension enables finite-sample tests for distribution trade-offs
When Are Trade-Off Functions Testable from Finite Samples?

Kaining Shi +2
cs.LG 2026-05-11 reviewed

Tail extrapolation approximates best-of-N gradients from m much smaller than N
What should post-training optimize? A test-time scaling law perspective

Muheng Li +2
stat.ML 2026-05-11 reviewed

LASSO matches homogeneous threshold for mixed-quality sparse data
Price of Quality: Sufficient Conditions for Sparse Recovery using Mixed-Quality Data

Youssef Chaabouni +1
cs.LG 2026-05-11 reviewed

Natural policy gradient equals smoothed policy iteration
Natural Policy Gradient as Doubly Smoothed Policy Iteration: A Bellman-Operator Framework

Phalguni Nanda +1
cs.CL 2026-05-11 reviewed

LLM personas match human survey distributions on stable questions
When Can Digital Personas Reliably Approximate Human Survey Findings?

Mumin Jia +3
cs.LG 2026-05-11 reviewed

Divide-and-conquer causal discovery extends to latent variables
A Recursive Decomposition Framework for Causal Structure Learning in the Presence of Latent Variables

Zheng Li +5
stat.ML 2026-05-11 reviewed

Amortized networks speed up causal sensitivity bounds by orders of magnitude
Amortizing Causal Sensitivity Analysis via Prior Data-Fitted Networks

Emil Javurek +4
stat.ML 2026-05-11 reviewed

Bayesian linear solvers are special cases of affine PIMs
Affine Tracing: A New Paradigm for Probabilistic Linear Solvers

Disha Hegde +2
cs.CV 2026-05-11 reviewed

Confidence weights fuse modalities for long-tailed recognition
Simultaneous Long-tailed Recognition and Multi-modal Fusion for Highly Imbalanced Multi-modal Data

Heegeon Yoon +1
math.OC 2026-05-11 reviewed

Bound certifies any learned controller for unknown linear systems
A PAC-Bayes Approach for Controlling Unknown Linear Discrete-time Systems

Yujia Luo +3
math.OC 2026-05-11 reviewed

PAC-Bayes bound guarantees controller performance on unknown systems
A PAC-Bayes Approach for Controlling Unknown Linear Discrete-time Systems

Yujia Luo +3
cs.LG 2026-05-11 reviewed

Semi-simulated tests pick different winners than real data for treatment effects
Real vs. Semi-Simulated: Rethinking Evaluation for Treatment Effect Estimation

George Panagopoulos
stat.ME 2026-05-11 reviewed

Covariate-dependent level links low-fidelity quantiles to high-fidelity ones
Multi-Fidelity Quantile Regression

Yixiang Liu +1
stat.ML 2026-05-11 reviewed

Sharp jumps in feature overlap set optimal neural scaling laws
Sharp feature-learning transitions and Bayes-optimal neural scaling laws in extensive-width networks

Minh-Toan Nguyen +1
stat.ML 2026-05-11 reviewed

Mass lift certifies regret in guided diffusion optimization
Regret Analysis of Guided Diffusion for Black-Box Optimization over Structured Inputs

Masaki Adachi +3
stat.ML 2026-05-11 reviewed

Low-fidelity data yields kernels for high-fidelity PDE solving
Multifidelity Gaussian process regression for solving nonlinear partial differential equations

Fatima-Zahrae El-Boukkouri +2
stat.ML 2026-05-11 reviewed

Unified taxonomy clarifies ML uncertainty for physics
Uncertainty in Physics and AI: Taxonomy, Quantification, and Validation

Manuel Haußmann +2
stat.ML 2026-05-11 reviewed

Expert losses cut MoE training time for time series
Fast Training of Mixture-of-Experts for Time Series Forecasting via Expert Loss Integration

Btissame El Mahtout +1
stat.ML 2026-05-11 reviewed

Test error in augmented random features depends only on data and augmentation moments
Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation

Lucas Morisset +2
cs.LG 2026-05-11 reviewed

Anchored TS safely reduces regret using shifted offline data
Sample-Mean Anchored Thompson Sampling for Offline-to-Online Learning with Distribution Shift

Bochao Li +3
cs.LG 2026-05-11 reviewed

Median anchoring cuts regret in online bandits with shifted offline data
Sample-Mean Anchored Thompson Sampling for Offline-to-Online Learning with Distribution Shift

Bochao Li +3
stat.ML 2026-05-11 reviewed

Neural feature maps scale exact GP inference
Scalable Gaussian process inference via neural feature maps

Anthony Stephenson
cs.CV 2026-05-11 reviewed

Focal sets plus fuzzy logic tame uncertainty in hierarchical image labels
A neurosymbolic Approach with Epistemic Deep Learning for Hierarchical Image Classification

Ezel Kilicdere +2
cs.LG 2026-05-11 reviewed

Deeper Picard iterations cut truncation error without unbounded estimation error
Generalization Error Bounds for Picard-Type Operator Learning in Nonlinear Parabolic PDEs

Koichi Taniguchi +1
math.ST 2026-05-11 reviewed

GAN method estimates full causal distributions with minimax optimality
Extended Wasserstein-GAN Approach to Causal Distribution Learning: Density-Free Estimation and Minimax Optimality

Shu Tamano +1
cs.LG 2026-05-11 reviewed

Scaling rules transfer hyperparameters from small to large DenseAMs
Hyperparameter Transfer for Dense Associative Memories

Roi Holtzman +2
stat.ML 2026-05-11 reviewed

Cyclic LiNG models coarsen to identifiable low-dimensional DAGs
Coarsening Linear Non-Gaussian Causal Models with Cycles

Francisco Madaleno +2
stat.ML 2026-05-11 reviewed

Subsampled CLT turns PFN predictions into valid Thompson samples
PFN-TS: Thompson Sampling for Contextual Bandits via Prior-Data Fitted Networks

Yan Shuo Tan +5
cs.LG 2026-05-11 reviewed

Kaplan-Meier estimators give unbiased ARL and ADD for finite sequences
Accurate Evaluation of Quickest Changepoint Detectors via Non-parametric Survival Analysis

Taiki Miyagawa +1
cs.LG 2026-05-11 reviewed

Generative models show an innovation window before memorizing data
The two clocks and the innovation window: When and how generative models learn rules

Binxu Wang +2
stat.ML 2026-05-11 reviewed

Wasserstein projection gives optimal private sampling
Differentially Private Sampling from Distributions via Wasserstein Projection

Shokichi Takakura +2
stat.ML 2026-05-11 reviewed

Federated LLMs keep explicit consistency and coverage under bandwidth budgets
Federated Language Models Under Bandwidth Budgets: Distillation Rates and Conformal Coverage

Prasanjit Dubey +1
cs.LG 2026-05-11 reviewed

Order-gap measure gives stopping rule for adaptive learning
Consolidation-Expansion Operator Mechanics: A Unified Framework for Adaptive Learning

Debashis Guha
cs.LG 2026-05-11 reviewed

Order-gap tracks distance to settled state in learning systems
Consolidation-Expansion Operator Mechanics: A Unified Framework for Adaptive Learning

Debashis Guha
stat.ML 2026-05-11 reviewed

Multicalibration corrected without clean labels using contamination matrices
Unified Approach for Weakly Supervised Multicalibration

Futoshi Futami +1
stat.ML 2026-05-11 reviewed

Rectified AI laws cut bias in Bayesian priors from limited data
Supercharging Bayesian Inference with Reliable AI-Informed Priors

Jongwoo Choi +1
cs.LG 2026-05-10 reviewed

Kernel regression error bounds cover non-Gaussian noise
On Uniform Error Bounds for Kernel Regression under Non-Gaussian Noise

Johannes Teutsch +4