archive
Every paper Pith has read. Search by title, abstract, or pith.
2684 papers in stat.ML · page 10
-
Semantic sampling yields unbiased calibration metric for open QA
A Semantic-Sampling Framework for Evaluating Calibration in Open-Ended Question Answering
-
AM-PPI narrows CIs 10-40% by routing cases to right predictor
Active Multiple-Prediction-Powered Inference
-
Queryable LoRA routes shared low-rank atoms by network state
Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms
-
Log d time recovers latent Hawkes networks
On Observation Time for Recovering Latent Hawkes Networks
-
Deep Sets require embedding dimension linear in set size for universality
Embedding Dimension Lower Bounds for Universality of Deep Sets and Janossy Pooling
-
Newton's method converges exponentially for infinite-width neural nets
Convergence Analysis of Newton's Method for Neural Networks in the Overparameterized Limit
-
Newton's method reaches zero loss exponentially fast in wide neural nets
Convergence Analysis of Newton's Method for Neural Networks in the Overparameterized Limit
-
Bounded Gaussian surface area allows non-negative L1 approximations
A Note on Non-Negative $L_1$-Approximating Polynomials
-
Rebiasing debiased estimates shortens intervals with valid coverage
Empirical Bayes Rebiasing
-
Deep learning infers red-giant seismic parameters from short TESS data
Inferring Asteroseismic Parameters from Short Observations Using Deep Learning: Application to TESS and K2 Red Giants
-
Test pinpoints locations where treatments alter outcome distributions
Semiparametric Efficient Test for Interpretable Distributional Treatment Effects
-
This paper proposes shifting LLM judges from full substitutes to auxiliary tools in a…
Augmenting Human Evaluation with LLM Judges: How Many Human Reviews Do You Need?
-
Penalty methods reach ε-KKT points for bilevel minimax problems in Õ(ε^{-4}) steps
Penalty-Based First-Order Methods for Bilevel Optimization with Minimax and Constrained Lower-Level Problems
-
Encoder trained on pairs scales inference to sets of thousands
It Just Takes Two: Scaling Amortized Inference to Large Sets
-
Bayes predictives make confidence sequences asymptotically log-optimal
Asymptotically Log-Optimal Bayes-Assisted Confidence Sequences for Bounded Means
-
Bayes-assisted sequences match oracle efficiency for bounded means
Asymptotically Log-Optimal Bayes-Assisted Confidence Sequences for Bounded Means
-
Single gradient flow solves inverse problems at low cost
Consistency Regularised Gradient Flows for Inverse Problems
-
Target correction equates online kernel regression to offline
Characterizing and Correcting Effective Target Shift in Online Learning
-
Chance-level black-box classification decays exponentially with more queries
Black-box model classification under the discriminative factorization
-
MuP keeps spectral outliers width-independent in deep linear nets
Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer
-
Outlier modes in deep network spectra grow consistently across widths
Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer
-
Attention beats Fourier for PDEs on irregular shapes
When Attention Beats Fourier: Multi-Scale Transformers for PDE Solving on Irregular Domains
-
EM convergence governed by missing-information operator
Expectation-Maximization as a Spectrally Governed Relaxation Flow
-
POETS performs KL-regularized Thompson sampling via LLM policy ensembles
POETS: Uncertainty-Aware LLM Optimization via Compute-Efficient Policy Ensembles
-
Flow matching on raw counts beats baselines with fewer parameters
Flow Matching for Count Data
-
Learned topology maximizes Fisher information in lensing maps
TopoFisher: Learning Topological Summary Statistics by Maximizing Fisher Information
-
Counterfactuals generated as deconfounding flows from observations
Debiased Counterfactual Generation via Flow Matching from Observations
-
Prefix consistency weights CoT votes to match accuracy at 4.6x fewer tokens
Reliable Chain-of-Thought via Prefix Consistency
-
FHDMs match minimax rates for spherical data
Statistical Convergence of Spherical First Hitting Diffusion Models
-
New bound makes contrastive learning scale with class count
A Refined Generalization Analysis for Extreme Multi-class Supervised Contrastive Representation Learning
-
Contrastive learning bounds scale only with number of classes
A Refined Generalization Analysis for Extreme Multi-class Supervised Contrastive Representation Learning
-
Energy minimization yields weight-tied layers matching Transformer baselines
Revisiting Transformer Layer Parameterization Through Causal Energy Minimization
-
Bayesian optimization discovers tasks with only log regret overhead
Open-Ended Task Discovery via Bayesian Optimization
-
Masked-position latent prediction beats MLM on protein tasks
ProteinJEPA: Latent prediction complements protein language models
-
Trained Transformers admit spectrum-adaptive generalization bounds
Spectrum-Adaptive Generalization Bounds for Trained Deep Transformers
-
Energy subtraction on paired elements recovers signed OTA aggregates
Resource-Element Energy Difference for Noncoherent Over-the-Air Federated Learning
-
Energy difference on two resources replaces CSI for wireless federated learning
Resource-Element Energy Difference for Noncoherent Over-the-Air Federated Learning
-
Calibrated noise on single samples yields unbiased private gradients
Modulated learning for private and distributed regression with just a single sample per client device
-
Bernstein bonus improves kernel RL regret bound
Improved Model-based Reinforcement Learning with Smooth Kernels
-
New algorithm matches lower bounds on cost in reward-constrained bandits
Cost-Ordered Feasibility for Multi-Armed Bandits with Cost Subsidy
-
Token overlaps perturb template rules but graph geometry can preserve margins
When Symbol Names Should Not Matter: A Logistic Theory of Fresh-Symbol Classification
-
Learned rule extends finite cluster trees to arbitrary depth
Classification Fields: Arbitrarily Fine Recursive Hierarchical Clustering From Few Examples
-
Bandit policy logs regret on upper-quantile targets
Conformal-Style Quantile Analyses for Stochastic Bandits
-
MLE attains sub-Gaussian tails and entropic normality
Sub-Gaussian Concentration and Entropic Normality of the Maximum Likelihood Estimator
-
Poisson-Moreau drift yields near-optimal almost sure rates for Markovian SA
Almost Sure Convergence Rates of Stochastic Approximation and Reinforcement Learning via a Poisson-Moreau Drift
-
Diffusion policies fix exploration limits in multi-agent RL
Decentralized Diffusion Policy Learning for Enhanced Exploration in Cooperative Multi-agent Reinforcement Learning
-
Averaging trajectory errors calibrates conformal sets for diffusion models
TRACE: Transport Alignment Conformal Prediction via Diffusion and Flow Matching Models
-
Fixed neural networks with definable layers have finite PAC sample complexity
Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity