archive
Every paper Pith has read.
14903 papers in cs.LG · page 17
-
Context management determines real-world Transformer Turing-completeness
Position: The Turing-Completeness of Autoregressive Transformers Relies Heavily on Context Management
-
Game creatures become RL testbeds in new MuJoCo suite
ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
-
One reward function trains policies for four game robots
ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
-
Two time scales in SGD cause memorization in generative models
A dynamical systems view of training generative models and the memorization phenomenon
-
TokenDrift cuts Gen-PPL by 89% at 4 steps in DDLMs
Drifting Objectives for Refining Discrete Diffusion Language Models
-
Finite dynamics samples enforce safety during RL learning
Sampling-Based Safe Reinforcement Learning
-
Pre-training boosts time series detection by 375% but not forecasting
Quantifying the Pre-training Dividend: Generative versus Latent Self-Supervised Learning for Time Series Foundation Models
-
Mirror maps reach same max-margin with sparse or dense features
Implicit Bias of Mirror Flow in Homogeneous Neural Networks: Sparse and Dense Feature Learning
-
Spiking blocks replace Transformer nonlinearities with <1% accuracy drop
Plug-and-Play Spiking Operators: Breaking the Nonlinearity Bottleneck in Spiking Transformers
-
Majority vote locks wrong answers after brief correct window in TTRL
Detecting and Mitigating the Correct-Answer Extinction Window in Test-Time Reinforcement Learning with Majority Voting
-
CEPO boosts math reasoning to 43.43% at 2B and 60.56% at 4B
CEPO: RLVR Self-Distillation using Contrastive Evidence Policy Optimization
-
Model fuses layout and netlist to predict cell delay at 0.92% error
FusionCell: Cross-Attentive Fusion of Layout Geometry and Netlist Topology for Standard-Cell Performance Prediction
-
Output-layer gradient norm gates reuse to cut RLVR samples by 2.93x
When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
-
Pilot-only model beats full-CSI baselines across frequencies
PilotWiMAE: Pilot-Native Representation Learning for Wireless Channels
-
Adaptive tuning raises LLM jailbreak harm scores from 6% to 70%
Adaptive Probe-based Steering for Robust LLM Jailbreaking
-
Feedback prefixing improves LLM scaling by up to 2.8x efficiency
Introspective X Training: Feedback Conditioning Improves Scaling Across all LLM Training Stages
-
ODE traces low-loss paths for sequential model merging
Unlocking the Potential of Continual Model Merging: An ODE Perspective
-
ODE paths limit forgetting when merging models sequentially
Unlocking the Potential of Continual Model Merging: An ODE Perspective
-
Large models improve with unfiltered low-quality data
A Bitter Lesson for Data Filtering
-
TIDE halves training time and lifts perturbed ImageNet accuracy by 1.65%
TIDE: Asymmetric Neural Circuits for Stabilized Temporal Inhibitory-Excitatory Dynamics
-
JUDO outperforms GPT-4o on industrial anomaly QA with normal image references
JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA
-
Variance penalty on penultimate neurons cuts medical AI bias
Neuron Incidence Redistribution for Fairness in Medical Image Classification
-
Adam momentum reverses roles in zero-sum games
Understanding Dynamics of Adam in Zero-Sum Games: An ODE Approach
-
Tweedie formulae now cover non-Gaussian diffusions
Tweedie's Formulae and Diffusion Generative Models Beyond Gaussian
-
AI inference costs multiply Phillips curve slope by lambda-bar
The Economics of AI Inference: Inflation Dynamics, Welfare Costs, and Optimal Monetary Policy under the Inference-Cost Phillips Curve
-
LLM safety benchmarks are orbits under group actions
The Evaluation Game: Beyond Static LLM Benchmarking
-
Concept ontology filters noisy negatives to lift chest X-ray zero-shot tasks
Concept-Guided Noisy Negative Suppression for Zero-Shot Classification and Grounding of Chest X-Ray Findings
-
Deep learning outperforms physics models on floods and weather
Accurate, Efficient, and Explainable Deep Learning Approaches for Environmental Science Problems
-
Optical pass checks 15 deepfake videos simultaneously
Scalable, Energy-Efficient Optical-Neural Architecture for Multiplexed Deepfake Video Detection
-
Atlas text boosts mammography BI-RADS accuracy
MAM-CLIP: Vision-Language Pretraining on Mammography Atlases for BI-RADS Classification
-
Closed-form subsidy maximizes welfare under model collapse
The Economics of Model Collapse: Equilibrium, Welfare, and Optimal Provenance Subsidies in Synthetic Data Markets
-
Repositioned anchors keep motion contacts across body shapes
Skinned Motion Retargeting with Spatially Adaptive Interaction Guidance
-
Action models align asymmetrically with brain action signals
Brain alignment of reasoning and action representations from vision-language and action models during naturalistic gameplay
-
Bounding box layouts generate editable 3D parts
CompoSE: Compositional Synthesis and Editing of 3D Shapes via Part-Aware Control
-
Claim differences as RL rewards balance caption hallucinations and omissions
ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison
-
Supreme Court quashes 18 points more matrimonial petitions than Karnataka HC
IMLJD: A Computational Dataset for Indian Matrimonial Litigation Analysis
-
Disentangling signals improves single-cell perturbation forecasts
What Makes a Representation Good for Single-Cell Perturbation Prediction?
-
Benchmark labels hallucinations via explicit reference worlds
HalluWorld: A Controlled Benchmark for Hallucination via Reference World Models
5 Piths -
Protein Thoughts ranks true binders at mean position 11.2
Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery
-
Unified signals close the gap between centralized and federated learning
OmniISR: A Unified Framework for Centralized and Federated Learning via Intermediate Supervision and Regularization
-
LLMs close 99% of deals but earn low profits in hidden pricing
PrefBench: Evaluating Zero-Shot LLM Agents in Hidden-Preference Personalized Pricing Negotiations
-
MOCHA improves agent skill correctness on every task
MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization
-
Exterior rotation improves NMF convergence and accuracy
An Exterior Method for Nonnegative Matrix Factorization
-
Sheaf neural ODE forecasts brain dynamics from graphs
BrainDyn: A Sheaf Neural ODE for Generative Brain Dynamics
-
Partial re-noising raises Sudoku accuracy from 56% to 75%
Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement
-
Method clusters subjects and learns their distinct causal graphs
A Unified Framework for Structure-Aware Clustering and Heterogeneous Causal Graph Learning
-
LSTM needs more noise separation than EM for reliable classification
An Objective Performance Evaluation of the LSTM Networks in Time Series Classification
-
Adaptive penalty proves convergence for feasible Pareto hypernetworks
A Two-Phase Adaptive Balanced Penalty Method for Controllable Pareto Front Learning under Split Feasibility Conditions
-
Matérn noise gives flow matching triangulation-agnostic behavior
Matérn Noise for Triangulation-Agnostic Flow Matching on Meshes
-
RF and DNN share knowledge in both directions effectively
Cross-Paradigm Knowledge Distillation: A Comprehensive Study of Bidirectional Transfer Between Random Forests and Deep Neural Networks for Big Data Applications