archive

Every paper Pith has read. Search by title, abstract, or pith.

14903 papers in cs.LG · page 10

  1. q-bio.NC 2026-05-20 reviewed
    Stimulus symmetries produce distinct RSMs for equivalent codes

    Stimulus symmetries can confound representational similarity analyses

    Farhad Pashakhanloo +1

  2. cs.LG 2026-05-20 reviewed
    Clients pick own models to cut federated comms 44x and raise accuracy

    Optimized Federated Knowledge Distillation with Distributed Neural Architecture Search

    Chaimaa Medjadji +5

  3. cs.CL 2026-05-20 reviewed
    Regularization curbs prompt overfitting for better LLM generalization

    TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization

    Lucheng Fu +6

  4. cs.LG 2026-05-20 reviewed
    CRAFT projects updates to resolve conflicts in federated learning

    CRAFT: Conflict-Resolved Aggregation for Federated Training

    Ziqi Wang +2

  5. cs.LG 2026-05-20 reviewed
    Bernoulli metrics distinguish memorized from generalizing networks

    A New Framework to Analyse the Distributional Robustness of Deep Neural Networks

    Divij Khaitan +1

  6. cs.DC 2026-05-20 reviewed
    Simulator predicts LLM serving latency with 6% error

    Frontier: Towards Comprehensive and Accurate LLM Inference Simulation

    Yicheng Feng +5

  7. cs.LG 2026-05-20 reviewed
    RL cuts pedestrian waits 79% via better crosswalks and signals

    DeCoR: Design and Control Co-Optimization for Urban Streets Using Reinforcement Learning

    Bibek Poudel +4

  8. cs.LG 2026-05-20 reviewed
    Inductive logic turns neural circuit findings into transferable theories

    From Circuit Evidence to Mechanistic Theory: An Inductive Logic Approach

    Nura Aljaafari +2

  9. cs.LG 2026-05-20 reviewed
    Contrasting patients with controls isolates disease subgroups

    Automatic Discovery of Disease Subgroups by Contrasting with Healthy Controls

    Robin Louiset +4

  10. cs.LG 2026-05-20 reviewed
    Semantic route cuts mental health prediction error across datasets

    TimeSRL: Generalizable Time-Series Behavioral Modeling via Semantic RL-Tuned LLMs -- A Case Study in Mental Health

    Yuang Fan +10

  11. stat.ML 2026-05-20 reviewed
    Large learning rates alter transformer attractors to cycles and chaos

    Large-Step Training Dynamics of a Two-Factor Linear Transformer Model

    Krishnakumar Balasubramanian

  12. cs.LG 2026-05-20 reviewed
    Tabular models use distinct similarity readouts despite matching accuracy

    A Mechanistic Study of Tabular Foundation Models

    Marin Biloš +3

  13. cs.LG 2026-05-20 reviewed
    One-step generative policies add multimodal actions to mirror descent RL

    Stochastic MeanFlow Policies: One-Step Generative Control with Entropic Mirror Descent

    Zeyuan Wang +8

  14. cs.LG 2026-05-20 reviewed
    One-step MeanFlow policies beat Gaussian baselines in RL

    Stochastic MeanFlow Policies: One-Step Generative Control with Entropic Mirror Descent

    Zeyuan Wang +8

  15. cs.LG 2026-05-20 reviewed
    PCA loss matches supervised accuracy in unsupervised feature selection

    Objective-Induced Bias and Search Dynamics in Multiobjective Unsupervised Feature Selection

    Mathieu Cherpitel +3

  16. cs.LG 2026-05-20 reviewed
    LLM agents design MCU neural nets in hours instead of days

    AutoMCU: Feasibility-First MCU Neural Network Customization via LLM-based Multi-Agent Systems

    Penglin Dai +5

  17. cs.LG 2026-05-20 reviewed
    Moderate warm-up lets offline DPO surpass online RL on math reasoning

    How Much Online RL is Enough? Informative Rollouts for Offline Preference Optimization in RLVR

    Richa Verma +1

  18. cs.LG 2026-05-20 reviewed
    Dual-level experts reach 78% global accuracy in federated learning

    FedCoE: Bridging Generalization and Personalization via Federated Coordinated Dual-level MoEs

    Penglin Dai +5

  19. cs.LG 2026-05-20 reviewed
    Pricing learns demand from one revenue and resets for shifts

    Nonparametric Learning and Earning with One-Point Feedback under Nonstationarity

    Xiangyu Yang +3

  20. cs.LG 2026-05-20 reviewed
    Chain of thought splits into benefit and cost with stability bounds

    On the Cost and Benefit of Chain of Thought: A Learning-Theoretic Perspective

    Yue Zhang +3

  21. stat.ML 2026-05-20 reviewed
    Wasserstein bounds set tuning rules for annealed Langevin in SBI

    Theoretical guidelines for annealed Langevin dynamics in compositional simulation-based inference

    Camille Touron +3

  22. cs.LG 2026-05-20 reviewed
    Fluid-inspired velocity field fixes oversmoothing in deep GNNs

    Graph Navier Stokes Networks

    Zexing Zhao +6

  23. cs.LG 2026-05-20 reviewed
    Contrast sub-blocks in windows to learn time series features

    Divide and Contrast: Learning Robust Temporal Features without Augmentation

    Abdul-Kazeem Shamba +2

  24. cs.LG 2026-05-20 reviewed
    Strategy-map DAG keeps self-evolving agents from repeating old routines

    APEX: Autonomous Policy Exploration for Self-Evolving LLM Agents

    Yibo Li +7

  25. cs.LG 2026-05-20 reviewed
    10% heads on 10% data deliver 8.3 pp gain with 7x speedup in LLM alignment

    From Parameters to Data: A Task-Parameter-Guided Fine-Tuning Pipeline for Efficient LLM Alignment

    Hao Chen +9

  26. cs.LG 2026-05-20 reviewed
    Octahedral triplet quantizer trims KV cache bits

    OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization

    Mark Boss +3

  27. cs.LG 2026-05-20 reviewed
    Preference tuning cuts RL policy failures by over 60%

    PREFINE: Preference-Based Implicit Reward and Cost Fine-Tuning for Safety Alignment

    Richa Verma +3

  28. stat.ML 2026-05-20 reviewed
    Decomposition recovers shared LoRA subspace across clients

    Federated LoRA Fine-Tuning for LLMs via Collaborative Alignment

    Shuaida He +2

  29. cs.LG 2026-05-20 reviewed
    QED makes RL policies 100 times more consistent across runs

    Behavior-Consistent Deep Reinforcement Learning

    Marcel Hussing +4

  30. cs.LG 2026-05-20 reviewed
    QED cuts cross-run divergence in RL by two orders of magnitude

    Behavior-Consistent Deep Reinforcement Learning

    Marcel Hussing +4

  31. quant-ph 2026-05-20 reviewed
    Quantum RL matches classical on chemical flowsheet design

    Enhanced Reinforcement Learning-based Process Synthesis via Quantum Computing

    Austin Braniff +7

  32. eess.SY 2026-05-20 reviewed
    YANN-RL cuts training time for chemical process control

    Reinforcement Learning-based Control via Y-wise Affine Neural Networks: Comparative Case Studies for Chemical Processes

    Austin Braniff +1

  33. cs.LG 2026-05-20 reviewed
    RL fine-tuning lifts code generation pass@1 by 19% on MBPP

    Domain-Adaptable Reinforcement Learning for Code Generation with Dense Rewards

    Erfan Aghadavoodi Jolfaei +4

  34. stat.ML 2026-05-20 reviewed
    Adaptive batch scaling unlocks large-batch RL

    Scalable Reinforcement Learning via Adaptive Batch Scaling

    Jongchan Park

  35. cs.LG 2026-05-20 reviewed
    ChunkFT fits full fine-tuning of 8B models in 14GB GPU memory

    ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning

    Yongkang Liu +9

  36. stat.ML 2026-05-20 reviewed
    Gradient similarities unify measures of model complexity

    A Rigorous, Tractable Measure of Model Complexity

    Oskar Allerbo +1

  37. cs.LG 2026-05-20 reviewed
    Multi-slot ad matching lifts revenue per user nearly 29 percent

    Beyond Single Slot: Joint Optimization for Multi-Slot Guaranteed Display Advertising

    Zhaoqi Zhang +7

  38. cs.LG 2026-05-20 reviewed
    Quantum circuit generator cuts mismatch in synthetic fraud data

    Q-SYNTH: Hybrid Quantum-Classical Adversarial Augmentation for Imbalanced Fraud Detection

    Adam Innan +4

  39. cs.LG 2026-05-20 reviewed
    Backward data generation lets compact model beat Mathematica on first integrals

    Learning First Integrals via Backward-Generated Data and Guided Reinforcement Learning

    Jingfeng Zhong +3

  40. cs.CV 2026-05-20 reviewed
    YOLOv11 detects military targets in synthetic thermal and night drone images

    Comparative Analysis of Military Detection Using Drone Imagery Across Multiple Visual Spectrums

    Sourov Roy Shuvo +5

  41. cs.CL 2026-05-20 reviewed
    Fine-tuned LLM reaches 0.866 F1 on Spanish psychiatric ICD coding

    Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models

    Fernando Ortega +5

  42. cs.LG 2026-05-20 reviewed
    SMoA outperforms LoRA in low-budget fine-tuning

    SMoA: Spectrum Modulation Adapter for Parameter-Efficient Fine-Tuning

    Yongkang Liu +9

  43. cs.SD 2026-05-20 reviewed
    Model separates animal, natural, and human sounds in field recordings

    CoarseSoundNet: Building a reliable model for ecological soundscape analysis

    Alexander Gebhard +6

  44. cs.SD 2026-05-20 reviewed
    Deep learning model separates animal

    CoarseSoundNet: Building a reliable model for ecological soundscape analysis

    Alexander Gebhard +6

  45. cs.CV 2026-05-20 reviewed
    Cognitive-physical RL adds foresight to safer driving policies

    Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving

    Yang Wu +5

  46. cs.CV 2026-05-20 reviewed
    CoPhy RL framework reaches SOTA on NAVSIM with BEV foresight

    Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving

    Yang Wu +5

  47. cs.LG 2026-05-20 reviewed
    Fine-tuning erases reasoning traces while answers stay correct

    Reasoning-Trace Collapse: Evaluating the Loss of Explicit Reasoning During Fine-Tuning

    Lukas Twist +2

  48. cs.LG 2026-05-20 reviewed
    Virtual samples cut advantage collapse in GRPO by over half

    Advantage Collapse in Group Relative Policy Optimization: Diagnosis and Mitigation

    Xixiang He +7

  49. cs.CV 2026-05-20 reviewed
    Linear utility improves DPO for diffusion and flow image models

    Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

    Kesong Li +5

  50. cs.DC 2026-05-20 reviewed
    FLECA defends decentralized EV learning from attacks

    Automated Byzantine-Resilient Clustered Decentralized Federated Learning for Battery Intelligence in Connected EVs

    Mouhamed Amine Bouchiha +2