archive
Every paper Pith has read.
14903 papers in cs.LG · page 12
-
Taxonomy-based generator yields verifiable planning data for LLMs
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models
-
Gradient moment method cuts 3D Gaussian count by 85-97%
CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation
-
Runtime bounds certify quantized KV attention with exact fallback
Runtime-Certified Bounded-Error Quantized Attention
-
LOSCAR-SGD overlaps local steps with sparse delayed updates
LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging
-
N-step correction tightens PPO bound for RL with verifiable rewards
Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards
-
Cluster runtime cuts RLVR GPU costs by up to 37.58%
PlexRL: Cluster-Level Orchestration of Serviceized LLM Execution for RLVR
-
Hypernetwork generates full robot policies from instructions alone
DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation
-
DualOptim+ bridges shared and delta states to balance LLM forgetting and retention
DualOptim+: Bridging Shared and Decoupled Optimizer States for Better Machine Unlearning in Large Language Models
-
ReMax proves first sublinear regret bound for M=2 Gaussian bandits
Finite-Time Regret Analysis of Retry-Aware Bandits
-
Polynomial alternatives match activation-based vision models
Activation-Free Backbones for Image Recognition: Polynomial Alternatives within MetaFormer-Style Vision Models
-
DPO matches RLHF only if optimal policy favors human responses
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment
-
Patching state centroids aligns transformer outputs with HMM counterfactuals
Markovian Circuit Tracing for Transformer State Dynamic
-
7B open LLMs run GraphRAG locally for EHR schema queries
GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval
-
OlmoEarth models cut training GPU hours by 1.7x
OlmoEarth v1.1: A more efficient family of OlmoEarth models
-
Preference vector tunes task balance in merged continual learning models
Tunable MAGMAX: Preference-Aware Model Merging for Continual Learning
-
Two GPU counters match MFU within 2 points at fleet scale
Instant GPU Efficiency Visibility at Fleet Scale
-
Contour images let CNNs pick black-box optimizers
Beyond Numerical Features: CNN-Driven Algorithm Selection via Contour Plots for Continuous Black-Box Optimization
-
Only two of 20 Transformer modifications transfer at 1-3B
Most Transformer Modifications Still Do Not Transfer at 1-3B: A 2020-2026 Update to Narang et al. (2021) with Downstream Evaluation and a Noise Floor
-
Local writes accumulate into global solutions in recursive reasoners
Interaction Locality in Hierarchical Recursive Reasoning
-
Intermediate alignment cuts physics residuals by 66% in diffusion models
Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment
-
Meta-learning from queries builds cumulative bias against spurious correlations
Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations
-
LLM interventions create user drift that biases simulated experiments
The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study
-
Benchmark reveals optimizer rankings flip across shape problems
ShapeBench: A Scalable Benchmark and Diagnostic Suite for Standardized Evaluation in Aerodynamic Shape Optimization
-
New guidance resolves gradient conflicts in flow models
Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards
-
Bias correction cuts pretraining loss in AdamW and similar optimizers
Correcting Stochastic Update Bias in Preconditioned Language Model Optimizers
-
Distillation from richer pseudo-samples improves sparse glucose estimates
PACD-Net: Pseudo-Augmented Contrastive Distillation for Glycemic Control Estimation from SMBG
-
GLU shrinks NTK condition number for faster convergence
The Devil is in the Condition Numbers: Why is GLU Better than non-GLU Structure?
-
Machine learning ties lncRNA features to type 2 diabetes
Multi-Modal Machine Learning for Population- and Subject-Specific lncRNA-Type 2 Diabetes Association Analysis
-
Hidden states at paragraph boundaries tune verifier strictness
The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering
-
Testbed embeds detectable hacks for automatic reward-gaming checks
Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale
-
RL scores full distributions to fix LLM regression
Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression
-
Open-source iris algorithms pass first official IREX evaluation
Lowering the Barrier to IREX Participation: Open-Source Algorithms, Toolkit, and Benchmarking for Iris Recognition
-
Conformal tests bound false discoveries for every possible threshold
Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference
-
Android crowds run large DNNs at 43 MB RAM per phone
Memory-Efficient Partitioned DNN Inference on Resource-Constrained Android Crowds
-
Group statistics adapt clipping and temperature to lift LLM math scores
AGPO: Adaptive Group Policy Optimization with Dual Statistical Feedback
-
GMM calibration lets recommenders use all noisy feedback
Robust Recommendation from Noisy Implicit Feedback: A GMM-Weighted Bayes-label Transition Matrix Framework
-
Decision-path flips raise random forest accuracy
Decision-Path Patterns as Tree Reliability Signals: Path-based Adaptive Weighting for Random Forest Classification
-
Decision-path flips yield unbiased per-sample weights for random forests
Decision-Path Patterns as Tree Reliability Signals: Path-based Adaptive Weighting for Random Forest Classification
-
SAVER selectively activates vision to boost F1 and cut latency in multimodal IE
SAVER: Selective As-Needed Vision Evidence for Multimodal Information Extraction
-
Neural solver matches SOTA hypervolume in 40% less time
WeCon: An Efficient Weight-Conditioned Neural Solver for Multi-Objective Combinatorial Optimization Problems
-
WebGPU backend cuts LLM memory use by 29-33% in browsers
Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU
-
Agentic system solves 8 of 10 research math problems
RMA: an Agentic System for Research-Level Mathematical Problems
-
DPO converges in distributed settings with rates set by communication and heterogeneity
Distributed Direct Preference Optimization
-
Self-limiting losses compress embeddings without overfitting
DIVE: Embedding Compression via Self-Limiting Gradient Updates
-
Deep learning clears motion in free-breathing heart MRI
Motion-Robust Deep Reconstruction for Free-Breathing Cardiac Cine MRI
-
Scale calibration makes median-of-means work for distributed PCA
Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis
-
Dynamic experts cut error on shifting time series
Dynamic TMoE: A Drift-Aware Dynamic Mixture of Experts Framework for Non-Stationary Time Series Forecasting
-
Frozen encoders plus tabular models hit SOTA on multimodal tasks
Modular Multimodal Classification Without Fine-Tuning: A Simple Compositional Approach
-
Looped transformers run in linear time and outperform standard versions
LT2: Linear-Time Looped Transformers
-
AI reviewer beats top human on Nature papers
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists