archive
Every paper Pith has read. Search by title, abstract, or pith.
14903 papers in cs.LG · page 14
-
Tighter quadratic bounds cut conservatism in neural net reachability
Quadratic Characterizations for Reachability Analysis of Neural Networks
-
A single predictor transfers oracle hyperparameter labels from variational denoisers to…
Oracle Supervision Transfers for Hyperparameter Prediction in Model-Based Image Denoising
-
Trained reflectors improve language agents on new tasks
Training Language Agents to Learn from Experience
-
Code gen picks winner by clustering behaviors on auto-generated inputs
Code Generation by Differential Test Time Scaling
-
Classifier uncertainty narrows conformal intervals by 39% for confident cases
CASCADE Conformal Prediction: Uncertainty-Adaptive Prediction Intervals for Two-Stage Clinical Decision Support
-
Spectral memory branch lifts DP-SGD accuracy on CIFAR
SMA-DP: Spectral Memory-Aware Differential Privacy for Deep Learning
-
Linear probes on frozen LLMs forecast time series without supervision
LLM Pretraining Shapes a Generalizable Manifold: Insights into Cross-Modal Transfer to Time Series
-
VLMs rearrange visible objects at 53-97% but fail occlusion at 6-45%
Do Vision-Language Models Understand 3D Scenes or Just Catalogue Objects?
-
Weight decay separates memorization
Weight Decay Regimes in Grokking Transformers: Cheap Online Diagnostics
-
Tensor algebra recovers angular-momentum rules from molecules alone
Group-Algebraic Tensors: Provably-optimal Equivariant Learning and Physical Symmetry Discovery
-
Users beat AI by fixing its systematic errors
Can Conversational XAI Improve User Performance? An Experimental Study
-
Routing weights produce hierarchical attributions at zero cost
BOHM: Zero-Cost Hierarchical Attribution for Compound AI Systems
-
Contradiction graph decides VC dimension threshold for any m
Contradiction Graphs Determine VC Dimension
5 Piths -
Model update paths yield better uncertainty than final probabilities
Reading Calibrated Uncertainty from Language Model Trajectories
-
13 MB adapter beats larger cache translators for LLMs
Latent Cache Flow: Model-to-Model Communication Without Text
-
MLLMs infer fracture planes with Miller indices and reject invalid cases
Miller-Index-Based Latent Crystallographic Fracture Plane Reasoning and generation with Vision-Language Models
-
Supervised LDA boosts separability to 0.197 in plant phenomics data
Supervised Latent Restructuring for Small-Data Quantum Learning in Plant Phenomics
-
Spectral basis in LLMs allows online merging of preference policies
Spectral Souping: A Unified Framework for Online Preference Alignment
-
MXFP4 error splits into three parts each fixing a different RL failure
Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor
-
Negative random effects group shows 400x larger causal effects
Understanding Deterioration Random Effects for Causal Discovery in Infrastructure Management
-
Scoring functions recover causal graphs with latent variables
Score-Based Causal Discovery of Latent Variable Causal Models
-
Tor network maintains fixed nine-dimensional structure over 67 days
Latent Geometry as a Structural Monitor: Eigenspace Alignment for Anomaly Detection in Anonymity Networks
-
Bigger 3D models trained on 50M driving scenes top Waymo leaderboard
STELLAR: Scaling 3D Perception Large Models for Autonomous Driving
-
Integral operators gain from longer windows in fMRI tasks
Nonlocal operator learning for fMRI encoding and decoding tasks
-
DEL raises LLM number prediction accuracy on math benchmarks
DEL: Digit Entropy Loss for Numerical Learning of Large Language Models
-
Per-sample temperatures make teacher soft labels consistent
Consistently Informative Soft-Label Temperature for Knowledge Distillation
-
Nudges to learnable states yield 7x larger skill gains than standard AI sharing
Proximal State Nudging: Reducing Skill Atrophy from AI Assistance
-
Symmetrized cross-entropy produces unique convex multi-class unhinged loss
Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels
-
Importance sampling corrects ILA to recover true posteriors
Corrected Integrated Laplace Approximation for Bayesian Inference in Latent Gaussian Models
-
Krylov approximation unlearns data 48x faster than retraining
Causal Unlearning in Collaborative Optimization: Exact and Approximate Influence Reversal under Adversarial Contributions
-
EEG microstates from one clustering step outperform traditional features on multiple tasks
Atoms of Thought: Universal EEG Representation Learning with Microstates
-
AUDITS benchmark tests detectors on 530K manipulated images
Multi-axis Analysis of Image Manipulation Localization
-
ML ensemble forecasts haor floods 72 hours ahead with 89.6% accuracy
HaorFloodAlert: Deseasonalized ML Ensemble for 72-Hour Flood Prediction in Bangladesh Haor Wetlands
-
Prototype layer matches ResNet accuracy on composite X-ray defects
Interpretable Computer Vision for Defect Detection in X-ray Tomography of Aerospace SiC/SiC Composites
-
Gating ensemble harvests reliable negatives for fraud models
SAGE: Scalable Automatic Gating Ensemble for Confident Negative Harvesting in Fraud Detection
-
Graph topology decides when models collapse
When Does Model Collapse Occur in Structured Interactive Learning?
-
Post-hoc calibration sharpens GP lower tails for optimization
Goal-Oriented Lower-Tail Calibration of Gaussian Processes for Bayesian Optimization
-
Repeating smaller datasets speeds up training
Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases
-
Frozen encoder beats task-specific models on four trajectory tasks
TrajTok: Adaptive Spatial Tokenization for Trajectory Representation Learning
-
Streaming abstraction unifies DAS interactive analysis and production
FiLark: a streaming-first software framework for end-to-end exploration, annotation, and algorithm integration in distributed acoustic sensing
-
Recovery profiles reveal brain dimensions models miss despite high accuracy
Beyond Prediction Accuracy: Target-Space Recovery Profiles for Evaluating Model-Brain Alignment
-
Grid sketch achieves optimal Wasserstein runtime for smooth laws
Optimizing Computational-Statistical Runtime for Wasserstein Distance Estimation
-
Single recipe scales time series models from 4M to 2.5B parameters
Toto 2.0: Time Series Forecasting Enters the Scaling Era
-
Single trajectory yields neural k-inductive barriers for unknown dynamics
k-Inductive Neural Barrier Certificates for Unknown Nonlinear Dynamics
-
AutoML for health risk prediction reduces to few key components
A Reproducible Log-Driven AutoML Framework for Interpretable Pipeline Optimization in Healthcare Risk Prediction
-
No fixed marginal covariance is safe for all geometries in JEPAs
Beyond Isotropy in JEPAs: Hamiltonian Geometry and Symplectic Prediction
-
Optimal representation size shrinks with abundant pretraining data
Optimal Representation Size: High-Dimensional Analysis of Pretraining and Linear Probing
-
Pruning plus retrieval yields up to 5.41× speculative decoding speedups
Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding
-
Coupled graph model boosts damage localization in unseen plate areas
WaveGraphNet: Physics-Consistent Guided-Wave Damage Localization through Coupled Inverse-Forward Graph Learning