archive
Every paper Pith has read.
2685 papers in stat.ML · page 15
-
Spectral sparsifiers keep GNN embedding geometry stable
Spectral Graph Sparsification Preserves Representation Geometry in Graph Neural Networks
-
TopoNTK detects topology that graph kernels miss
Topological Neural Tangent Kernel
-
Diffusion operator gives closed-form affinities for neural layers
Diffusion Operator Geometry of Feedforward Representations
-
Virtual particles enable MLE for mean-field limits from one trajectory
Recursive Maximum Likelihood Estimation for Interacting Particle Systems using Virtual Particles
-
Disease perturbs biomarker modes for precise prognosis
Disease Is a Spectral Perturbation
-
Agentic AI orchestration should be Bayesian
Position: agentic AI orchestration should be Bayes-consistent
-
Randomized subspaces accelerate Nesterov gradient methods
Randomized Subspace Nesterov Accelerated Gradient
-
Volatility updates in HGF now avoid negative precision
Robust volatility updates for Hierarchical Gaussian Filtering
-
Decentralized MCMC converges in Wasserstein distance for constrained distributions
Decentralized Proximal Stochastic Gradient Langevin Dynamics
-
AI personas yield closed-form Bayesian updates for adaptive querying
Adaptive Querying with AI Persona Priors
-
Q-learning with multipattern approximation bounds regret at O(H² N^H √K)
Reinforcement Learning with Markov Risk Measures and Multipattern Risk Approximation
-
Regularized Newton boosting converges globally at O(1/k²)
Gradient Regularized Newton Boosting Trees with Global Convergence
-
Batch norm extended to complex domain neural networks
Batch Normalization for Neural Networks on Complex Domains
-
Predictive Bayesian credible sets can have near-zero coverage
Concentration and Calibration in Predictive Bayesian Inference
-
Spatial evidence gates Bayesian CP to fix coverage and bloat
Optimal Spatio-Temporal Decoupling for Bayesian Conformal Prediction
-
M-CaStLe recovers causal structure in multivariate space-time grids
M-CaStLe: Uncovering Local Causal Structures in Multivariate Space-Time Gridded Data
-
Taylor mapping replaces kernel in point-set registration
Structured Analytic Coherent Point Drift for Non-Rigid Point Set Registration
-
Analytic mappings shrink CPD deformation to order-dependent size
Structured Analytic Coherent Point Drift for Non-Rigid Point Set Registration
-
Uniform penalty prevents RLVR collapse to few correct answers
Uniform-Correct Policy Optimization: Breaking RLVR's Indifference to Diversity
-
Gauge momentum stabilizes PDE parametrizations
A Dirac-Frenkel-Onsager principle: Instantaneous residual minimization with gauge momentum for nonlinear parametrizations of PDE solutions
-
Constant informational speed lifts graph diffusion quality
Information-geometric adaptive sampling for graph diffusion
-
Recursive partitioning enables linear-time Bayesian optimization
Bayesian Optimization in Linear Time
-
Adjoint methods show finite gradient variance in diffusion fine-tuning
A unified perspective on fine-tuning and sampling with diffusion and flow models
-
Soft segmentation learns decision weights with lowest regret
OTSS: Output-Targeted Soft Segmentation for Contextual Decision-Weight Learning
-
SHIFT cuts localized contamination RMSE for dose-response curves
SHIFT: Robust Double Machine Learning for Average Dose-Response Functions under Heavy-Tailed Contamination
-
Covariance-aware penalties boost neural net accuracy on correlated inputs
Adaptive Norm-Based Regularization for Neural Networks
-
Wasserstein regret optimization yields exact water-filling policy
Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback
-
DRRO solves RLHF regret exactly via water-filling under ℓ1 sets
Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback
-
Latent diffusion plus conformal bands tops load imputation benchmarks
SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting
-
Matchgate evolutions yield provable quantum Gaussian processes
Provable and scalable quantum Gaussian processes for quantum learning
-
Sequential GPs enable streaming inference in signal processing
Sequential Inference for Gaussian Processes: A Signal Processing Perspective
-
Kernel smoothing gives accurate value estimates for LLM reasoning
Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning
-
Kernel smoothing enables accurate LLM value estimates from few traces
Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning
-
Mixture of experts sharpens prediction-powered inference
Prediction-powered Inference by Mixture of Experts
-
Decoupled descent forces train error to match test error
Decoupled Descent: Exact Test Error Tracking Via Approximate Message Passing
-
Smooth losses achieve linear consistency rates
Linear-Core Surrogates: Smooth Loss Functions with Linear Rates for Classification and Structured Prediction
-
Standard DPO losses are inconsistent for neural networks
Mind the Gap: Structure-Aware Consistency in Preference Learning
-
MILD algorithm corrects deferral bias caused by imbalanced experts
Optimized Deferral for Imbalanced Settings
-
R packages unify forecast reconciliation across three frameworks
FoReco and FoRecoML: A Unified Toolbox for Forecast Reconciliation in R
-
Bayesian one-pass algorithm attains optimal posterior convergence
Bayesian online learning in the one-pass regime: Frequentist validity and uncertainty quantification
-
Bayesian X-learner recovers calibrated CATE posteriors under heavy tails
Bayesian X-Learner: Calibrated Posterior Inference for Heterogeneous Treatment Effects under Heavy-Tailed Outcomes
-
Tree discretization plus ILP matching cuts bias in causal estimates
A Novel Computational Framework for Causal Inference: Tree-Based Discretization with ILP-Based Matching
-
Trees plus ILP cut bias and time in causal matching
A Novel Computational Framework for Causal Inference: Tree-Based Discretization with ILP-Based Matching
-
Copula transform lifts ensemble to 0.96 accuracy on groundwater pollution
Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution
-
Neural network picks regression variables using OLS estimates
Linear Models, Variable Selection, Artificial Intelligence
-
νGPT transfers learning rates across width and depth
Learning Rate Transfer in Normalized Transformers
-
HyCNNs approximate quadratics with exponentially fewer parameters
Hyper Input Convex Neural Networks for Shape Constrained Learning and Optimal Transport
-
Prior change removes ln ln T from Squint bound
A Note on How to Remove the $\ln\ln T$ Term from the Squint Bound
-
Revenue learners converge for any distribution but at arbitrarily slow rates
On the Learning Curves of Revenue Maximization
-
Transformers converge to noise-synchronized particle systems
Stochastic Scaling Limits and Synchronization by Noise in Deep Transformer Models