archive
Every paper Pith has read.
14903 papers in cs.LG · page 16
-
Alternating Muon and Lion steps improves loss at lower compute
LionMuon: Alternating Spectral and Sign Descent for Efficient Training
-
Laplace diffusion generates long forecasts for irregular time series
Latent Laplace Diffusion for Irregular Multivariate Time Series
-
Stitched model lifts rewards to noisy latents for faster alignment
Stitched Value Model for Diffusion Alignment
-
Prototypes on the hypersphere reach neural collapse by design
Neural Collapse by Design: Learning Class Prototypes on the Hypersphere
-
LLMs optimize code via priors
Prior Knowledge or Search? A Study of LLM Agents in Hardware-Aware Code Optimization
-
Conformal methods deliver distribution-free coverage for AI agent scores
Distribution-Free Uncertainty Quantification for Continuous AI Agent Evaluation
-
B-cos GNNs deliver exact per-node explanations after one forward pass
B-cos GNNs: Faithful Explanations through Dynamic Linearity
-
Variance-aware regret bound proven optimal for logistic MDPs
Minimax Optimal Variance-Aware Regret Bounds for Multinomial Logistic MDPs
-
Rank-1 queries keep ZO signals strong for high-rank LoRA
AR1-ZO: Topology-Aware Rank-1 Zeroth-Order Queries for High-Rank LoRA Fine-Tuning
-
Quadratic model handles heavy and light tailed noise
Robust Subspace-Constrained Quadratic Models for Low-Dimensional Structure Learning
-
Models distort physical quantity distributions despite plausible paths
Mechanisms of Misgeneralization in Physical Sequence Modeling
-
Aligning spectrum and molecule models improves metabolite retrieval
MSAlign: Aligning Molecule and Mass Spectra Foundation Models for Metabolite Identification
-
Ensemble ML classifies epilepsy in IED-free stimulation EEG
Classification of IED-free EEG Responses for Assisted Epilepsy Diagnosis
-
Multi-agent LLM framework hits 97 percent task completion on engineering benchmarks
EngiAI: A Multi-Agent Framework and Benchmark Suite for LLM-Driven Engineering Design
-
GNNs detect communities to aid graph signal interpolation
Graph Neural Networks for Community Detection in Graph Signal Analysis
-
Hydra keeps 95% attack success across 500 concept pairs in diffusion models
Awakening the Hydra: Stabilizing Multi-Concept Backdoor Injection in Text-to-Image Diffusion Models
-
CRP groups medical tasks from text for 73% Dice with 4% forgetting
MedCRP-CL: Continual Medical Image Segmentation via Bayesian Nonparametric Semantic Modality Discovery
-
Diffusion copula turns simultaneous crashes into expected events
Probabilistic Multivariate Time Series Forecasting with Diffusion Copulas
-
AI workflow finds cryomicroneedle mix with 95 percent viability
Agentic Discovery of Cryomicroneedle Formulations
-
Spectral filter repairs fine-tuning damage without retraining
Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining
-
Consensus particles converge exponentially to bi-level optima
Convergence of Consensus-Based Particle Methods for Nonconvex Bi-Level Optimization
-
Dual-view net estimates cardiac output from short PPG
Cross-View Attention Fusion Net: A Prior-Guided Dual-View Representation Learning for Cardiac Output Estimation from Short-Term PPG Signals
-
OScaR reaches near-lossless INT2 KV cache quantization
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond
-
Static quantization speeds LLM inference on mobile NPUs
Quant.npu: Enabling Efficient Mobile NPU Inference for on-device LLMs via Fully Static Quantization
-
BCI-sift toolbox picks neural features to raise decoding accuracy
BCI-sift: An automated feature selection toolbox for Brain Computer Interface applications
-
Knowledge graph embeddings leak sensitive user attributes
Inferring Sensitive Attributes from Knowledge Graph Embeddings: Attack and Defense Strategies
-
One LLM system optimizes text to beat specialists on six tasks
optimize_anything: A Universal API for Optimizing any Text Parameter
-
Hierarchical Gaussian filters close the gap in deep predictive coding
Closed-form predictive coding via hierarchical Gaussian filters
-
Federated stochastic approximation gets explicit Gaussian error bounds
Gaussian Approximation and Multiplier Bootstrap for Federated Linear Stochastic Approximation
-
Reconstruction error from linear queries limits to sqrt(2d/(d+1)) delta
Optimal Reconstruction from Linear Queries
-
Regularized graph diffusion yields stable EIT reconstructions
Diffusion Graph Posterior Sampling for Nonlinear Inverse Problems with Application to Electrical Impedance Tomography
-
MiMuon reaches O(1/N) generalization bound for matrix models
MiMuon: Mixed Muon Optimizer with Improved Generalization for Large Models
-
Divergence measures locate where tree surrogates lose fidelity
A Family of Divergence Measures for Evaluating the Reconstruction Quality of Explainable Ensemble Trees
-
Lévy B-spline posterior contracts near minimax rates in Besov spaces
Posterior Contraction of Lévy Adaptive B-spline Regression in Besov Spaces
-
SVD-ordered paths yield less noisy model attributions
Spectral Integrated Gradients for Coarse-to-Fine Feature Attribution
-
Graph surrogate cuts dental aerosol rollout time by 37x
Physics-Informed Graph Neural Network Surrogates for Turbulent Nanoparticle Dispersion in Dental Clinical Environments
-
Tree paths turn irregular EHR data into traceable evidence
TreeText-CTS: Compact, Source-Traceable Tree-Path Evidence for Irregular Clinical Time-Series Prediction
-
Order-book no-trades yield square-root regret in market making
Online Market Making and the Value of Observing the Order Book
-
Trajectory selection gives 10x faster training and better out-of-domain web agents
Weasel: Out-of-Domain Generalization for Web Agents via Importance-Diversity Data Selection
-
First open high-fidelity CFD dataset for high-lift aircraft released
HiLiftAeroML: High-Fidelity Computational Fluid Dynamics Dataset for High-Lift Aircraft Aerodynamics
-
Neural warm starts triple speed of UAV-UGV handover planning
Learning-Accelerated Optimization-based Trajectory Planning for Cooperative Aerial-Ground Handover Missions
-
Rotations fix MXFP4 activation errors in LLMs
TORQ: Two-Level Orthogonal Rotation for MXFP4 Quantization
-
Decoupling network separates target and jamming in mixed HRRP
JointHRRP-Net: A Statistically Constrained Decoupling Network for Joint Target and Jamming Recognition in Composite Jamming
-
Density ratios enable adjustable post-hoc deferral
Density-Ratio Losses for Post-Hoc Learning to Defer
-
MILP solves fairness repair for neural networks with formal guarantees
Provable Fairness Repair for Deep Neural Networks
-
Inference backend shifts LLM benchmark scores by 16.6 points
The Silent Hyperparameter: Quantifying the Impact of Inference Backends on LLM Reproducibility
-
Early core token attention ranks best seeds for text-to-image results
Boosting Text-to-Image Diffusion Models via Core Token Attention-Based Seed Selection
-
Base models fool AI detectors into rating text as human
Base Models Look Human To AI Detectors