archive
Every paper Pith has read. Search by title, abstract, or pith.
2684 papers in stat.ML · page 3
-
Semi-parametric BART separates covariates from epigenetic trees
Semi-Parametric Bayesian Additive Regression Trees for Risk Prediction with High-Dimensional Epigenetic Signatures and Low-Dimensional Covariates
-
Grid sketch achieves optimal Wasserstein runtime for smooth laws
Optimizing Computational-Statistical Runtime for Wasserstein Distance Estimation
-
Soft-log transform lets flow matching handle heavy tails
Tail Annealing for Heavy-Tailed Flow Matching
-
Gated estimator cuts manifold density error by 22-36%
Variance-Reduced Manifold Sampling via Polynomial-Maximization Density Estimation
-
Domain cuts let neural operators handle PDE discontinuities
Smooth Piecewise Cutting for Neural Operator to Handle Discontinuities and Sharp Transitions
-
Benchmark separates ML models on flux extrapolation via tail errors
FLUXtrapolation: A benchmark on extrapolating ecosystem fluxes
-
Laplace diffusion generates long forecasts for irregular time series
Latent Laplace Diffusion for Irregular Multivariate Time Series
-
Variance-aware regret bound proven optimal for logistic MDPs
Minimax Optimal Variance-Aware Regret Bounds for Multinomial Logistic MDPs
-
Benchmark shows attention models scale better than RNNs on sequences
CogScale: Scalable Benchmark for Sequence Processing
-
Diffusion copula turns simultaneous crashes into expected events
Probabilistic Multivariate Time Series Forecasting with Diffusion Copulas
-
Federated stochastic approximation gets explicit Gaussian error bounds
Gaussian Approximation and Multiplier Bootstrap for Federated Linear Stochastic Approximation
-
MiMuon reaches O(1/N) generalization bound for matrix models
MiMuon: Mixed Muon Optimizer with Improved Generalization for Large Models
-
Lévy B-spline posterior contracts near minimax rates in Besov spaces
Posterior Contraction of Lévy Adaptive B-spline Regression in Besov Spaces
-
Order-book no-trades yield square-root regret in market making
Online Market Making and the Value of Observing the Order Book
-
Density ratios enable adjustable post-hoc deferral
Density-Ratio Losses for Post-Hoc Learning to Defer
-
Tweedie formulae now cover non-Gaussian diffusions
Tweedie's Formulae and Diffusion Generative Models Beyond Gaussian
-
Benchmark labels hallucinations via explicit reference worlds
HalluWorld: A Controlled Benchmark for Hallucination via Reference World Models
5 Piths -
Protein Thoughts ranks true binders at mean position 11.2
Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery
-
Method clusters subjects and learns their distinct causal graphs
A Unified Framework for Structure-Aware Clustering and Heterogeneous Causal Graph Learning
-
Factor-augmented SGD converges with streaming high-dimensional data
Factor Augmented High-Dimensional SGD
-
Trajectory selection beats sampling in delayed disambiguation
EviTrack: Selection over Sampling for Delayed Disambiguation
-
Regime gate improves time series forecast accuracy under shifts
DeRegiME: Deep Regime Mixtures for Probabilistic Forecasting under Distribution Shift
-
RL on All of Us data prescribes steadier higher daily steps
Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions
-
Thermodynamic bound sets optimal dataset size for linear regression
The Thermodynamic Costs of Simple Linear Regression
-
Multi-head attention error falls as subspaces decorrelate
Multi-Head Attention as Ensemble Nadaraya-Watson Estimation: Variance Reduction, Decorrelation, and Optimal Head Diversity
-
Higher-order Langevin dynamics reduce memorization in diffusion models
Reducing Diffusion Model Memorization with Higher Order Langevin Dynamics
-
Wrapper gives pathwise risk control for updating LLMs
Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs
4 Piths -
Total capacity of stationary physical systems predicts ML performance
Information Processing Capacity of Stationary Physical Systems: Theory, Data-efficient Estimation Methods, and Photonic Demonstration
-
Total IPC of stationary systems bounds to readout count and predicts ML results
Information Processing Capacity of Stationary Physical Systems: Theory, Data-efficient Estimation Methods, and Photonic Demonstration
-
Low-rank bandits recover drifting subspaces from scalar rewards
Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity
4 Piths -
Dual-channel networks select tensor structures with finite-sample guarantees
Dual-Channel Tensor Neural Networks: Finite-Sample Theory and Conformal Structure Selection
-
Greedy method learns optimal integer clinical risk scores directly
Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization
-
ScheduleFree+ beats WSD schedules on long LLM pretraining
ScheduleFree+: Scaling Learning-Rate-Free & Schedule-Free Learning to Large Language Models
-
Learned multipliers achieve optimal Theta(s/sqrt(N)) rate
Provably Data-driven Lagrangian Relaxation for Mixed Integer Linear Programming
4 Piths -
Beta law tracks conformal coverage under dependence
Conformal Prediction via Transported Beta Laws
-
Transformer model lowers earnings forecast error by 32 percent at ten years
SAGA: A Sequence-Adaptive Generative Architecture for Multi-Horizon Probabilistic Forecasting with Adaptive Temporal Conformal Prediction
-
Categorical confounder makes causal effects identifiable from proxies or multiple tests
Causal Inference with Categorical Unobserved Confounder via Mixture Learning
-
Girsanov path weights recover exact particle SMC for diffusion guidance
SURGE: Approximation and Training Free Particle Filter for Diffusion Surrogate
-
AdaGrad converges under heavy-tailed noise without knowing the tail index
Can Adaptive Gradient Methods Converge under Heavy-Tailed Noise? A Case Study of AdaGrad
-
FedNewton matches SGD accuracy with fewer rounds under privacy
Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning
-
Weighted DAG aggregation stabilizes causal discovery
Stable Causal Discovery via Directed Acyclic Graph Aggregation
-
Embeddings let federated clients achieve centralized Bayesian uncertainty
Federated Martingale Posterior Sampling
-
Manifold probe reveals how models encode time and space
Probing for Representation Manifolds in Superposition
-
Continuous diffusion scales to 20x compute gap of autoregressive models
Continuous Diffusion Scales Competitively with Discrete Diffusion for Language
-
Flow models gain per-sample confidence at standard sampling cost
Flowing with Confidence
-
Shallow ReLU^s networks beat random features below critical p
Shallow ReLU$^s$ Networks in $L^p$-Type and Sobolev Spaces: Approximation and Path-Norm Controlled Generalization
-
Path-norm ReLU^s nets match minimax regression rates
Shallow ReLU$^s$ Networks in $L^p$-Type and Sobolev Spaces: Approximation and Path-Norm Controlled Generalization
-
Path-norm ReLU nets hit minimax rates in regression
Shallow ReLU$^s$ Networks in $L^p$-Type and Sobolev Spaces: Approximation and Path-Norm Controlled Generalization
-
Markov Chain Decoders Fix Heavy-Tail Limits in VAEs
Markov Chain Decoders Overcome the Heavy-Tail Limitations of Lipschitz Generative Models
-
Closed-form policy optimizes allocation in censored survival trials
Adaptive Experimentation for Censored Survival Outcomes