archive

Every paper Pith has read.

14903 papers in cs.LG · page 3

cs.LG 2026-05-22 reviewed

Prefix prompts let frozen LLMs condition flows for multi-modal forecasts
PaP-NF: Probabilistic Long-Term Time Series Forecasting via Prefix-as-Prompt Reprogramming and Normalizing Flows

Minju Kim +1
cs.LG 2026-05-22 reviewed

Kernel agents top out at 0.94x production baselines
FastKernels: Benchmarking GPU Kernel Generation in Production

Gabriele Oliaro +7
cs.CV 2026-05-22 reviewed

Homography mapping yields linear bounds for camera motion verification
Lipschitz Optimization for Formal Verification of Homographies

Jean-Guillaume Durand +3
cs.LG 2026-05-22 reviewed

Region quotas stop wipe-out of reasoning blocks in KV caches
Adaptive Mass-Segmented KV Compression for Long-Context Reasoning

Junzhe Yang +1
cs.LG 2026-05-22 reviewed

Small labeled set plus pseudo-labels prunes datasets effectively
Label-Efficient Dataset Pruning via Semi-Supervised Pseudo-Labeling

Yeseul Cho +3
cs.LG 2026-05-22 reviewed

Pretrained graph model improves low-data OPF accuracy
Scalable Heterogeneous Graph Foundation Models for Data-Driven Optimal Power Flow in Smart Grids

Massimiliano Lupo Pasini +3
cs.LG 2026-05-22 reviewed

RankElastor stabilizes rank trajectories for scaled recommenders
Expand More, Shrink Less: Shaping Effective-Rank Dynamics for Dense Scaling in Recommendation

Guoming Li +9
cs.LG 2026-05-22 reviewed

r-value scores shrink conformal sets by excluding unstable candidates
Empirical Bayes Conformal Prediction for Vision and Language Models

Jiapeng Zeng +4
cs.LG 2026-05-22 reviewed

GPI finds good policies with cost independent of state space size
Pure Exploration for a Good Policy in Reinforcement Learning with Bandit Feedback

Zitian Li +1
cs.CL 2026-05-22 reviewed

Optimizing prompt embeddings boosts in-context learning
Self-Improving In-Context Learning

Baturay Saglam +1
cs.LG 2026-05-22 reviewed

Symmetric noise lifts AlpacaEval scores from 65% to 69% in fine-tuning
Understanding and Improving Noisy Embedding Techniques in Instruction Finetuning

Abhay Yadav
cs.CL 2026-05-22 reviewed

LLMs drop up to 88 points when tasks move to context middle
Positional Failures in Long-Context LLMs: A Blind Spot in Reasoning Benchmarks

Chuyifei Zhang +3
cs.CR 2026-05-22 reviewed

10 poisoned examples hijack targeted LLM tasks at 70%+ success
PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs

Luze Sun +4
cs.CR 2026-05-22 reviewed

ActInv recovers inputs from LLM split-inference activations
What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference

Mingyuan Fan +3
cs.LG 2026-05-22 reviewed

Limit space makes any-size input models universal
Any-Dimensional Invariant Universality

Shengtai Yao +2
cs.LG 2026-05-22 reviewed

Infra-Bayesian RL records lower worst-case regret than classical agents
Infra-Bayesian Reinforcement Learning Agents Outperform Classical RL For Worst-Case Robustness

Manish Aryal +12
stat.ML 2026-05-22 reviewed

Gradient descent recovers true similarity metric from triplets
Operationalizing Individual Fairness via Gradient Descent and Bradley-Terry Models

Conlan Olson +3
cs.LG 2026-05-22 reviewed

Channel relevance steers contrastive samples for time series anomaly detection
CALAD: Channel-Aware contrastive Learning for multivariate time series Anomaly Detection

Jaehyeop Hong +1
quant-ph 2026-05-22 reviewed

RL selects Clifford states that boost VQA energy accuracy 3x on average
Classical State Preparation for Variational Quantum Algorithms via Reinforcement Learning

Gino Kwun +2
cs.LG 2026-05-22 reviewed

Taylor-mode AD powering yields exact nested copula likelihoods
Archimedean Copula Inference via Taylor-Mode AD

Cambridge Yang +1
cs.LG 2026-05-22 reviewed

Rayleigh quotient fixes rare switching under privacy noise
When Determinants Are Not Enough: Private Rare Switching

Xingyu Zhou
cs.CV 2026-05-22 reviewed

Verified prompts plus longitudinal context raise lesion tracking Dice by 4.5 points
Exploiting Longitudinal Context in Clinician-Verified Interactive Lesion Tracking

Yannick Kirchhoff +7
cs.LG 2026-05-22 reviewed

Gen-ROTDA adapts bike-sharing demand models across years by anchoring on few target labels
Robust OT-Guided Generative Residual Domain Adaptation for Bike-Sharing Demand Prediction under Temporal Domain Shift

Yiming Ma
stat.ML 2026-05-21 reviewed

LLM Sparsity Prior lets spike-and-slab models ignore bad LLM weights
LLM Sparsity Prior for Robust Feature Selection

Caleb Skinner +2
cs.CR 2026-05-21 reviewed

Certified bounds eliminate overflows in encrypted neural nets
Encrypted Neural Networks without Overflows

Philipp Kern +5
cs.LG 2026-05-21 reviewed

Jacobian penalty on latent dynamics raises sample efficiency in DreamerV3
Dreaming Smoothly and Sample Efficiently with Gradient Penalized Latent Dynamics

Romil V. Sonigra +1
cs.LG 2026-05-21 reviewed

Depth biases networks toward low-rank softmax codes
The Implicit Bias of Depth: From Neural Collapse to Softmax Codes

Connall Garrod +2
stat.ML 2026-05-21 reviewed

KAN estimator converges independent of covariate dimension
KAPLAN: Kolmogorov-Arnold Prognostic Learnable Activation Networks for Survival Analysis

Stelios Boulitsakis Logothetis +2
cs.LG 2026-05-21 reviewed

5% FP16 blocks recover 89% of FP4-to-FP16 attention quality gap
ThriftAttention: Selective Mixed Precision for Long-Context FP4 Attention

Joe Sharratt
cs.LG 2026-05-21 reviewed

Attribution contract resolves ambiguity in generative model explanations
The Attribution Contract: Feature Attribution for Generative Language Models

Giang Nguyen
cs.LG 2026-05-21 reviewed

Global LP ranks every MoE expert to cut memory at low bits
GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs

Jianing Deng +6
cs.DC 2026-05-21 reviewed

Orbax speeds JAX checkpoint saves up to 3.5x over PyTorch
Orbax: Distributed Checkpointing with JAX

Colin Gaffney +15
cs.CV 2026-05-21 reviewed

Dithering defends vision models against adversarial attacks
Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

Yury Belousov +3
cs.CV 2026-05-21 reviewed

Vertex weights let mmWave data drive accurate SMPL body fits
Millimeter-wave Imaging for Anthropometric Body Measurement

Miriam Senne +4
cs.LG 2026-05-21 reviewed

One config matches tuned AdamW across 1-8x horizons on LLMs
Anytime Training with Schedule-Free Spectral Optimization

Anuj Apte +4
cs.LG 2026-05-21 reviewed

Controller routes LLM requests to best mode for 2x speedup
ModeSwitch-LLM: A Lightweight Phase-Aware Controller for Cross-Mode LLM Inference on a Single GPU

Aman Sunesh +2
cs.LG 2026-05-21 reviewed

Recognition of evaluations depends on model-benchmark pairs
Decomposing and Measuring Evaluation Awareness

Changling Li +5
cs.CL 2026-05-21 reviewed

Compositionality rises then falls in LLM self-training
Model Collapse as Cultural Evolution

Dongxin Guo +2
cs.CV 2026-05-21 reviewed

Motion data alone rivals video models trained on 10000x more examples
The TIME Machine: On The Power of Motion for Efficient Perception

Mantas Skackauskas +2
cs.LG 2026-05-21 reviewed

Sparse query gradients steer LLM paths and feedback levels
Steered Generation via Gradient-Based Optimization on Sparse Query Features

Sumanta Bhattacharyya +1
cs.CL 2026-05-21 reviewed

LLMs learn what not to say via frequency competition
Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

Dongxin Guo +2
cs.LG 2026-05-21 reviewed

Open datasets and software released for thermal-fluid AI
Open Multimodal Datasets and Open-Source Software for Data-Driven Modeling of Multiphase Transport and Thermal Systems

Christy Dunlap +8
cs.LG 2026-05-21 reviewed

Intermediate layers hold more task info than final layers
Uncovering the Latent Potential of Deep Intermediate Representations

Arnesh Batra +4
cs.LG 2026-05-21 reviewed

RADAR forecasts transfer by comparing representation trajectories
RADAR: Relative Angular Divergence Across Representations

Xavier Cadet +2
cs.LG 2026-05-21 reviewed

Latent states let a transformer adapt to time-series contexts without quadratic cost
World Machine: Towards Generative World Modeling for Time-Series

Elton Cardoso do Nascimento +4
cs.AI 2026-05-21 reviewed

Transformers have fixed accuracy limits set by layers and width
The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems

Dongxin Guo
cs.LG 2026-05-21 reviewed

Small models evolve their own agents via two-timescale updates
PACE: Two-Timescale Self-Evolution for Small Language Model Agents

Chen Ling +6
cs.LG 2026-05-21 reviewed

Lipschitz intermediaries enable approximate calibration of discrete properties
Smoothed Elicitation Complexity for Approximate $\Gamma$-calibration of Discrete Classification Tasks

Jessica Finocchiaro +2
q-fin.TR 2026-05-21 reviewed

LLM evolutionary optimizer boosts Bitcoin trading in backtests
MadEvolve: Evolutionary Optimization of Trading Systems with Large Language Models

Yurii Kvasiuk +3
q-bio.NC 2026-05-21 reviewed

Active sensing serves task control
Active Sensing Subserves Task-Level Control

Andrew Lamperski +5