archive
Every paper Pith has read. Search by title, abstract, or pith.
2684 papers in stat.ML · page 6
-
Product kernels recover saturation and multiple descent in high-dim KRR
Large Dimensional Kernel Ridge Regression: Extending to Product Kernels
-
Eigenvector alignment in kernels drives generalization in regression
On Kernel Eigen-alignments of KRR: Reconstruction and Generalization
-
Correlated models miscalibrate under Brier aggregation
When Individually Calibrated Models Become Collectively Miscalibrated
-
NN radii converge almost surely under polynomial mixing
Nearest-Neighbor Radii under Dependent Sampling
-
LLM priors guide source selection in cold-start adaptation
Language-Induced Priors for Domain Adaptation
-
Mixed gradients keep RL unbiased in hybrid discrete-continuous spaces
Policy Optimization in Hybrid Discrete-Continuous Action Spaces via Mixed Gradients
-
Target penalty induces bounded weights even with disjoint supports
TILT: Target-induced loss tilting under covariate shift
-
Moment matching lets diffusion sampling skip neural training
Training-Free Generative Sampling via Moment-Matched Score Smoothing
-
Pooled conformal calibration forces coverage or size distortion
On the Burden of Achieving Fairness in Conformal Prediction
-
Conformal prediction fairness forces a coverage-size trade-off
On the Burden of Achieving Fairness in Conformal Prediction
-
MSSP restores learning-rate transfer in scaled Mixture-of-Experts
How to Scale Mixture-of-Experts: From muP to the Maximally Scale-Stable Parameterization
-
Score matching yields polynomial sample bounds for polynomial families
Finite Sample Bounds for Learning with Score Matching
-
Mean shift particles approximate integrals from unnormalized densities
To discretize continually: Mean shift interacting particle systems for Bayesian inference
-
Conformal method bounds confident errors in CoT reasoning
Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning
-
Ising networks store continuous signals near 0.5 capacity
Finite-size scaling of hetero-associative retrieval in continuous-signal-driven Ising spin systems
-
TabPFN-3 tops tabular benchmarks in a single forward pass
TabPFN-3: Technical Report
-
Classical algorithm samples neural lottery tickets in poly(D) time
Winning Lottery Tickets in Neural Networks via a Quantum-Inspired Classical Algorithm
-
Valiant learnability equals poly-size query compression
What is Learnable in Valiant's Theory of the Learnable?
-
Hybrid VAE isolates scanner effects in brain connectomes
Unsupervised learning of acquisition variability in structural connectomes via hybrid latent space modeling
-
Entropic RL policy identification matches lower bound
Tight Sample Complexity Bounds for Entropic Best Policy Identification
-
Reasoning models sample tree languages exactly with log n memory
A Hierarchical Language Model with Predictable Scaling Laws and Provable Benefits of Reasoning
-
Marginal bridges cut denoising error in flow language models
Sampling from Flow Language Models via Marginal-Conditioned Bridges
-
Package turns anomaly scores into calibrated p-values
Conformal Anomaly Detection in Python: Moving Beyond Heuristic Thresholds with 'nonconform'
-
Single-loop actor-critic hits Õ(ε^{-2}) sample rate
Achieving $\epsilon^{-2}$ Sample Complexity for Single-Loop Actor-Critic under Minimal Assumptions
-
Deep learning reduces to layerwise low-degree spectral filtering
Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
-
Two auxiliary environments identify any nonlinear causal graph
Causal Learning with the Invariance Principle
-
Adaptive internal preprocessing beats external searches on 42 of 57 NIRS datasets
Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: A large-scale benchmark of operator-adaptive PLS and Ridge models
-
NIR calibration absorbs preprocessing into the model
Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: A large-scale benchmark of operator-adaptive PLS and Ridge models
-
Reused diffusion latents incur error from subspace misalignment
On the Limits of Latent Reuse in Diffusion Models
-
Rescaled stepsizes remove bias in async SGD
Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity
-
Trajectory augmentation enables offline RL from limited suboptimal data
Trajectory-Level Data Augmentation for Offline Reinforcement Learning
-
Flow matching conditioning is kernel smoothing
Support-Conditioned Flow Matching Is Kernel Smoothing
-
Perturbation cuts online FDR regret from linear to sqrt(T)
A Regret Perspective on Online Multiple Testing
-
Delight gate decides when to explore and reduces regret growth
Delightful Exploration
-
Learned continuous perturbations boost LLM extrapolation to new domains
Learning Perturbations to Extrapolate Your LLM
-
Activation split removes dequant bottleneck in LLM inference
Multi-Scale Dequant: Eliminating Dequantization Bottleneck via Activation Decomposition for Efficient LLM Inference
-
High-rank PINNs generalize despite differential operators
Unified generalization analysis for physics informed neural networks
-
Change-point sample needs depend on both jumps and positions
The Sample Complexity of Multiple Change Point Identification under Bandit Feedback
-
One template inequality unifies data-dependent generalization bounds
A Survey on Data-Dependent Worst-Case Generalization Bounds
-
Entropy rises with missing context in LLMs
LLMs as Implicit Imputers: Uncertainty Should Scale with Missing Information
-
The paper proposes a likelihood-free Bayesian filtering method using coupling-informed…
Coupling-Informed Transport Maps for Bayesian Filtering in Nonlinear Dynamical Systems
-
Kernels on parameters deliver bounds for nonlinear BO models
Kernel-based guarantees for nonlinear parametric models in Bayesian optimization
-
GP construction keeps mean identical across repetitions with smooth variation
Generative Modeling of Approximately Periodic Time Series by a Posterior-Weighted Gaussian Process
-
Hallucination limits in AI imaging fixed by forward model alone
On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods
-
Neural nets learn time series clusters from simulations
Amortized Neural Clustering of Time Series based on Statistical Features
-
Wavelet DPPs deliver better minibatch variance reduction
State-of-art minibatches via novel DPP kernels: discretization, wavelets, and rough objectives
-
Covariance estimate lifts few-step diffusion sampling
Covariance-aware sampling for Diffusion Models
-
Pre-trained net selects kernels for high-dim density estimates
Adaptive Kernel Density Estimation with Pre-training
-
Adaptive sampling fixes bias in fast FP8 RL rollouts
AIS: Adaptive Importance Sampling for Quantized RL
-
Coreset surrogate equals Wasserstein gap in flow matching
Coreset-Induced Conditional Velocity Flow Matching