archive

Every paper Pith has read.

14903 papers in cs.LG · page 12

cs.AI 2026-05-20 reviewed

Taxonomy-based generator yields verifiable planning data for LLMs
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models

Ziliang Zhao +9
cs.LG 2026-05-20 reviewed

Gradient moment method cuts 3D Gaussian count by 85-97%
CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

SeungJeh Chung +3
cs.LG 2026-05-20 reviewed

Runtime bounds certify quantized KV attention with exact fallback
Runtime-Certified Bounded-Error Quantized Attention

Dean Calver
cs.LG 2026-05-20 reviewed

LOSCAR-SGD overlaps local steps with sparse delayed updates
LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging

Yassine Maziane +3
cs.LG 2026-05-20 reviewed

N-step correction tightens PPO bound for RL with verifiable rewards
Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards

Deokgyu Yoon +6
cs.DC 2026-05-20 reviewed

Cluster runtime cuts RLVR GPU costs up to 37.58%
PlexRL: Cluster-Level Orchestration of Serviceized LLM Execution for RLVR

Yiqi Zhang +15
cs.RO 2026-05-20 reviewed

Hypernetwork generates full robot policies from instructions alone
DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

Hanxiang Ren +3
cs.LG 2026-05-20 reviewed

DualOptim+ bridges shared and delta states to balance LLM forgetting and retention
DualOptim+: Bridging Shared and Decoupled Optimizer States for Better Machine Unlearning in Large Language Models

Xuyang Zhong +3
cs.LG 2026-05-20 reviewed

ReMax proves first sublinear regret bound for M=2 Gaussian bandits
Finite-Time Regret Analysis of Retry-Aware Bandits

Bingkui Tong +3
cs.CV 2026-05-20 reviewed

Polynomial alternatives match activation-based vision models
Activation-Free Backbones for Image Recognition: Polynomial Alternatives within MetaFormer-Style Vision Models

Jeffrey Wang +2
cs.AI 2026-05-20 reviewed

DPO matches RLHF only if optimal policy favors human responses
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment

Zhiqin Yang +5
cs.LG 2026-05-20 reviewed

Patching state centroids aligns transformer outputs with HMM counterfactuals
Markovian Circuit Tracing for Transformer State Dynamic

Abdullah X
cs.CL 2026-05-20 reviewed

7B open LLMs run GraphRAG locally for EHR schema queries
GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

Peter Fernandes +1
cs.CV 2026-05-20 reviewed

OlmoEarth models cut training GPU hours by 1.7x
OlmoEarth v1.1: A more efficient family of OlmoEarth models

Gabriel Tseng +9
cs.LG 2026-05-20 reviewed

Preference vector tunes task balance in merged continual learning models
Tunable MAGMAX: Preference-Aware Model Merging for Continual Learning

Kei Hiroshima +2
cs.DC 2026-05-20 reviewed

Two GPU counters match MFU within 2 points at fleet scale
Instant GPU Efficiency Visibility at Fleet Scale

Connor Pedersen +4
cs.LG 2026-05-20 reviewed

Contour images let CNNs pick black-box optimizers
Beyond Numerical Features: CNN-Driven Algorithm Selection via Contour Plots for Continuous Black-Box Optimization

Yiliang Yuan +2
cs.LG 2026-05-20 reviewed

Only two of 20 Transformer modifications transfer at 1-3B
Most Transformer Modifications Still Do Not Transfer at 1-3B: A 2020-2026 Update to Narang et al. (2021) with Downstream Evaluation and a Noise Floor

Yang Zhao +4
cs.AI 2026-05-20 reviewed

Local writes accumulate into global solutions in recursive reasoners
Interaction Locality in Hierarchical Recursive Reasoning

Yosuke Miyanishi +1
cs.LG 2026-05-20 reviewed

Intermediate alignment cuts physics residuals by 66% in diffusion models
Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment

Haozhe Jia +8
cs.LG 2026-05-20 reviewed

Meta-learning from queries builds cumulative bias against spurious correlations
Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations

Kin Whye Chew +1
cs.CL 2026-05-20 reviewed

LLM interventions create user drift that biases simulated experiments
The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study

Victoria Lin +5
cs.LG 2026-05-20 reviewed

Benchmark reveals optimizer rankings flip across shape problems
ShapeBench: A Scalable Benchmark and Diagnostic Suite for Standardized Evaluation in Aerodynamic Shape Optimization

Shaghayegh Fazliani +5
cs.AI 2026-05-20 reviewed

New guidance resolves gradient conflicts in flow models
Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

Xuehui Yu +4
cs.LG 2026-05-20 reviewed

Bias correction cuts pretraining loss in AdamW and similar optimizers
Correcting Stochastic Update Bias in Preconditioned Language Model Optimizers

Nikhil Nayak +9
cs.LG 2026-05-20 reviewed

Distillation from richer pseudo-samples improves sparse glucose estimates
PACD-Net: Pseudo-Augmented Contrastive Distillation for Glycemic Control Estimation from SMBG

Canyu Lei +2
cs.LG 2026-05-20 reviewed

GLU shrinks NTK condition number for faster convergence
The Devil is in the Condition Numbers: Why is GLU Better than non-GLU Structure?

Xingyu Lyu +4
q-bio.GN 2026-05-20 reviewed

Machine learning ties lncRNA features to type 2 diabetes
Multi-Modal Machine Learning for Population- and Subject-Specific lncRNA-Type 2 Diabetes Association Analysis

Ashwani Siwach +2
cs.LG 2026-05-20 reviewed

Hidden states at paragraph boundaries tune verifier strictness
The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

Yefan Zhou +5
cs.LG 2026-05-20 reviewed

Testbed embeds detectable hacks for automatic reward-gaming checks
Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale

Amit Roth +4
cs.LG 2026-05-20 reviewed

RL scores full distributions to fix LLM regression
Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression

Jungsoo Park +6
cs.CV 2026-05-20 reviewed

Open-source iris algorithms pass first official IREX evaluation
Lowering the Barrier to IREX Participation: Open-Source Algorithms, Toolkit, and Benchmarking for Iris Recognition

Siamul Karim Khan +2
stat.ME 2026-05-20 reviewed

Conformal tests bound false discoveries for every possible threshold
Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference

Ziang Song +2
cs.LG 2026-05-20 reviewed

Android crowds run large DNNs at 43 MB RAM per phone
Memory-Efficient Partitioned DNN Inference on Resource-Constrained Android Crowds

Lakshani Manamperi +4
cs.LG 2026-05-20 reviewed

Group statistics adapt clipping and temperature to lift LLM math scores
AGPO: Adaptive Group Policy Optimization with Dual Statistical Feedback

Miaobo Hu +7
cs.LG 2026-05-20 reviewed

GMM calibration lets recommenders use all noisy feedback
Robust Recommendation from Noisy Implicit Feedback: A GMM-Weighted Bayes-label Transition Matrix Framework

Zongyu Li +5
cs.LG 2026-05-20 reviewed

Decision path flips raise random forest accuracy
Decision-Path Patterns as Tree Reliability Signals: Path-based Adaptive Weighting for Random Forest Classification

Youngjoon Park
cs.LG 2026-05-20 reviewed

Decision-path flips yield unbiased per-sample weights for random forests
Decision-Path Patterns as Tree Reliability Signals: Path-based Adaptive Weighting for Random Forest Classification

Youngjoon Park
cs.CV 2026-05-20 reviewed

SAVER selectively activates vision to boost F1 and cut latency in multimodal IE
SAVER: Selective As-Needed Vision Evidence for Multimodal Information Extraction

Miaobo Hu +7
cs.LG 2026-05-20 reviewed

Neural solver matches SOTA hypervolume at 40% less time
WeCon: An Efficient Weight-Conditioned Neural Solver for Multi-Objective Combinatorial Optimization Problems

Xuan Wu +9
cs.DC 2026-05-20 reviewed

WebGPU backend cuts LLM memory use by 29-33% in browsers
Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU

Reese Levine +7
cs.AI 2026-05-20 reviewed

Agentic system solves 8 of 10 research math problems
RMA: an Agentic System for Research-Level Mathematical Problems

Zelin Zhao +3
cs.LG 2026-05-20 reviewed

DPO converges in distributed settings with rates set by communication and heterogeneity
Distributed Direct Preference Optimization

Zhanhong Jiang
cs.CL 2026-05-20 reviewed

Self-limiting losses compress embeddings without overfitting
DIVE: Embedding Compression via Self-Limiting Gradient Updates

Dongfang Zhao
eess.IV 2026-05-20 reviewed

Deep learning clears motion in free-breathing heart MRI
Motion-Robust Deep Reconstruction for Free-Breathing Cardiac Cine MRI

Mahmut Yurt +10
stat.ME 2026-05-20 reviewed

Scale calibration makes median-of-means work for distributed PCA
Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis

Kisung You
cs.LG 2026-05-20 reviewed

Dynamic experts cut error on shifting time series
Dynamic TMoE: A Drift-Aware Dynamic Mixture of Experts Framework for Non-Stationary Time Series Forecasting

Jiawen Zhu +3
cs.LG 2026-05-20 reviewed

Frozen encoders plus tabular models hit SOTA on multimodal tasks
Modular Multimodal Classification Without Fine-Tuning: A Simple Compositional Approach

Herman Bergström +2
cs.LG 2026-05-20 reviewed

Looped transformers run linearly and outperform standard versions
LT2: Linear-Time Looped Transformers

Chunyuan Deng +6
cs.CL 2026-05-20 reviewed

AI reviewer beats top human on Nature papers
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

Seungone Kim +57