archive
Every paper Pith has read.
2684 papers in stat.ML · page 11
-
Fixed participation beats Poisson for DP-SGD privacy
Less Random, More Private: What is the Optimal Subsampling Scheme for DP-SGD?
-
Neural method delivers valid bounds on individual causal effects
Causal EpiNets: Precision-corrected Bounds on Individual Treatment Effects using Epistemic Neural Networks
-
Functional priors yield accurate posteriors in PINN inversion
Functional-prior-based approaches to Bayesian PDE-constrained inversion using physics-informed neural networks
-
Functional priors integrate into Bayesian PINN inversion
Functional-prior-based approaches to Bayesian PDE-constrained inversion using physics-informed neural networks
-
Matrix factorization speeds LLM evaluation by orders of magnitude
An Interpretable and Scalable Framework for Evaluating Large Language Models
-
Quantized model swaps give agents global FDR control in novelty detection
Decentralized Conformal Novelty Detection via Quantized Model Exchange
-
Latent model separates confounding for nonlinear IV with rich covariates
BGM-IV: an AI-powered Bayesian generative modeling approach for instrumental variable analysis
-
The paper introduces a hypothesis testing framework for adaptive auditing of AI systems…
Adaptive auditing of AI systems with anytime-valid guarantees
-
Cost-limited experiments chosen to maximize causal bound tightening
Optimal Experiments for Partial Causal Effect Identification
-
Safety heightens sensitivity of task-to-controller maps
Why Does Agentic Safety Fail to Generalize Across Tasks?
-
Response times identify average preferences from single anonymous choices
Response Time Enhances Alignment with Heterogeneous Preferences
-
Optimal transport localizes causal handles in neural nets
PLOT: Progressive Localization via Optimal Transport in Neural Causal Abstraction
-
Two sampling rules bound regret in f-divergence RLHF to O(log T)
$f$-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses
-
Differentiable relaxation recovers latent partial orders from linear traces
A Differentiable Bayesian Relaxation for Latent Partial-Order Inference
-
ABGD recovers piecewise linear models with near-minimax samples
Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions
-
Shared calibration reverses LLM judge comparisons
Bias and Uncertainty in LLM-as-a-Judge Estimation
-
Complexity penalty lets MMD tests optimize kernels without grids
Kernel Selection is Model Selection: A Unified Complexity-Penalized Approach for MMD Two-Sample Tests
-
Neural operator approximates conditioning for any joint density
One Operator for Many Densities: Amortized Approximation of Conditioning by Neural Operators
-
Multi-stage smoothing recovers evolving network edges
Nonparametric estimation of time-varying network connections by multi-stage smoothing
-
Attention weights follow top eigenvector of position matrix to maximize signal recovery
How Does Attention Help? Insights from Random Matrices on Signal Recovery from Sequence Models
-
Rod flow tracks Adam at edge of stability better than stable flows
A Rod Flow Model for Adam at the Edge of Stability
-
Online calibration tracks gradual drifts and detects abrupt shifts
Online Bayesian Calibration under Gradual and Abrupt System Changes
-
Attention sinks trace to first-token dimension disparity
The Structural Origin of Attention Sink: Variance Discrepancy, Super Neurons, and Dimension Disparity
-
Transformers run normalized gradient descent inside each layer for in-context logistic fit
Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent
-
Adaptive covariate selection preserves RCT validity under budget limits
DARTS: Targeting Prognostic Covariates in Budget-Constrained Sequential Experiments
-
Geometry-aware correction refines SABR volatility formula
A Geometry-Aware Residual Correction of Hagan's SABR Implied Volatility Formula
-
Adaptive network targeting outperforms static methods via Ising-RL
Dynamic Treatment on Networks
-
Hedging memory scales cuts forecast error 35 percent in shifting regimes
Hedging Memory Horizons for Non-Stationary Prediction via Online Aggregation
-
Proxy inferences calibrated by random effects from past domains
Estimate Level Adjustment For Inference With Proxies Under Random Distribution Shifts
-
Threshold post-processing meets risk constraints with minimal baseline change
Risk-Controlled Post-Processing of Decision Policies
-
Q-MMR delivers dimension-free off-policy bounds from Q-realizability alone
Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching
-
Reweighting yields dimension-free bounds for off-policy evaluation
Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching
-
Anchored LSTMs improve longevity forecasts where linear models fail
Neural-Actuarial Longevity Forecasting: Anchoring LSTMs for Explainable Risk Management
-
Decoupled PFNs identify epistemic-aleatoric split from synthetic priors
Decoupled PFNs: Identifiable Epistemic-Aleatoric Decomposition via Structured Synthetic Priors
-
Neyman score dictates balancing in debiased machine learning
Covariate Balancing and Riesz Regression Should Be Guided by the Neyman Orthogonal Score in Debiased Machine Learning
-
DQN finite-sample bounds work under temporal mixing with rate cost
Beyond the Independence Assumption: Finite-Sample Guarantees for Deep Q-Learning under $\tau$-Mixing
-
Class variance dictates learning order in diffusion models
The Interplay of Data Structure and Imbalance in the Learning Dynamics of Diffusion Models
-
Autoregressive models trained on incomplete data outperform imputation baselines
Order-Agnostic Autoregressive Modelling with Missing Data
-
Loop persistence spikes when neural nets grok
Topological Signatures of Grokking
-
Jacobi prior adds closed-form Bayesian step to 9.5 MB edge disease detector
TinyBayes: Closed-Form Bayesian Inference via Jacobi Prior for Real-Time Image Classification on Edge Devices
-
Recurrent switching systems gain identifiability proof and exact flow estimator
End-to-End Identifiable and Consistent Recurrent Switching Dynamical Systems
-
Attributions decompose into meta-attributions via Shapley games
Attributions All the Way Down? The Metagame of Interpretability
-
Multimodal model outperforms on imbalanced semi-supervised tasks
Multimodal Deep Generative Model for Semi-Supervised Learning under Class Imbalance
-
Smoothed quantile nets reach minimax rates
ConquerNet: Convolution-Smoothed Quantile ReLU Neural Networks with Minimax Guarantees
-
Direct volume minimization gives conditional quantiles
Super-Level-Set Regression: Conditional Quantiles via Volume Minimization
-
Trimming conformal calibration helps only under score neutrality
When Does Trimming Help Conformal Prediction? A Retained-Law Diagnostic under Calibration Contamination
-
Global-UCB bounds regret linearly by pre-training in open bandits
Bandit Learning in General Open Multi-agent Systems
-
Bi-Lipschitz flows give L1-universal approximation of densities
Expressivity of Bi-Lipschitz Normalizing Flows: A Score-Based Diffusion Perspective
-
Floating-point limits trigger slingshot loss spikes
Grokking or Glitching? How Low-Precision Drives Slingshot Loss Spikes