pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

14513 papers in cs.AI · page 12

  1. cs.AI 2026-05-19 reviewed
    AgentCo-op links existing agents into genomics workflows without redesign

    AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

    Shuaike Shen +4

  2. cs.AI 2026-05-19 reviewed
    RL method raises ToM accuracy from 0.2% to 76% on asymmetric tasks

    OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind

    Sharmin Sultana Srishty +4

  3. cs.CL 2026-05-19 reviewed
    CoT prompting leaves gender bias inside LLMs

    Mechanics of Bias and Reasoning: Interpreting the Impact of Chain-of-Thought Prompting on Gender Bias in LLMs

    Edie Pearman +5

  4. eess.IV 2026-05-19 reviewed
    This paper tests episodic sampling to build class-balanced batches for CT body…

    Disentangling Sampling from Training Budget in Class-Imbalanced CT Body Composition Segmentation

    Iason Skylitsis +2

  5. cs.LG 2026-05-19 reviewed
    MXFP4 error splits into three parts each fixing a different RL failure

    Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

    Xiaocan Li +2

  6. cs.LG 2026-05-19 reviewed
    MXFP4 error splits into three parts for targeted RL fixes

    Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

    Xiaocan Li +2

  7. cs.CV 2026-05-19 reviewed
    Bigger 3D models trained on 50M driving scenes top Waymo leaderboard

    STELLAR: Scaling 3D Perception Large Models for Autonomous Driving

    Yingwei Li +15

  8. cs.LG 2026-05-19 reviewed
    Integral operators gain from longer windows in fMRI tasks

    Nonlocal operator learning for fMRI encoding and decoding tasks

    Andreas Kramer +3

  9. cs.CV 2026-05-19 reviewed
    Meta-RL extracts rules to segment concepts at any reasoning level

    ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning

    Yuan Zhao +12

  10. cs.CL 2026-05-19 reviewed
    LLMs switch from instructions to patterns when history conflicts

    Do as I Say, Not as I Do: Instruction-Induction Conflict in LLMs

    Carolina Camassa +1

  11. cs.RO 2026-05-19 reviewed
    Human videos scale humanoid loco-manipulation without custom rewards

    SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework

    Tianshu Wu +7

  12. cs.CV 2026-05-19 reviewed
    Distortion in latent space guides better sampling for missing modalities

    Latent Space Guided Scenario Sampling for Multimodal Segmentation Under Missing Modalities

    Irem Ulku +2

  13. cs.CL 2026-05-19 reviewed
    DEL raises LLM number prediction accuracy on math benchmarks

    DEL: Digit Entropy Loss for Numerical Learning of Large Language Models

    Zhaohui Zheng +5

  14. cs.CR 2026-05-19 reviewed
    Local model classifies security documents at 95 percent accuracy

    Security Document Classification with a Fine-Tuned Local Large Language Model: Benchmark Data and an Open-Source System

    Ivan Dobrovolskyi

  15. cs.LG 2026-05-19 reviewed
    Per-sample temperatures make teacher soft labels consistent

    Consistently Informative Soft-Label Temperature for Knowledge Distillation

    Hoang-Chau Luong +3

  16. cs.CL 2026-05-19 reviewed
    AI dialogue models sync states and predict turns ahead

    Synchronization and Turn-Taking in Full-Duplex Speech Dialogue Models

    Pablo Riera +4

  17. q-fin.CP 2026-05-19 reviewed
    Memory lets RL agents beat competitive benchmarks in trade execution

    Memory-Induced Supra-Competitive Outcomes Between Deep Reinforcement Learning Agents in Optimal Trade Execution

    Christos Spyridon Koulouris +1

  18. cs.LG 2026-05-19 reviewed
    Krylov approximation unlearns data 48x faster than retraining

    Causal Unlearning in Collaborative Optimization: Exact and Approximate Influence Reversal under Adversarial Contributions

    Ali Mahdavi +3

  19. cond-mat.stat-mech 2026-05-19 reviewed
    Target-SAT triples solvable size for hardest random 3-SAT

    Targeting Clause Type Distributions: a Picklock for Random Satisfiability Problems

    J. Schwardt +1

  20. cond-mat.str-el 2026-05-19 reviewed
    NN variational 2-RDM reaches 0.1 meV below exact energy for Chern insulator

    Representability-Aware Neural Networks for Reduced Density Matrices: Application to Fractional Chern Insulators

    Justin B. Hart +6

  21. cs.CV 2026-05-19 reviewed
    LoRA upgrade turns text-to-image flows bidirectional

    FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision-Language Generation

    Eric Tillmann Bill +3

  22. cs.LG 2026-05-19 reviewed
    EEG microstates from one clustering step outperform traditional features on multiple tasks

    Atoms of Thought: Universal EEG Representation Learning with Microstates

    Xinyang Tian +5

  23. cs.AI 2026-05-19 reviewed
    Four-part SDB contract organizes LLM agent runtimes

    A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents

    Vasundra Srinivasan

  24. cs.LO 2026-05-19 reviewed
    ASP automates long-term power grid planning

    Long-term Power Grid Planning via Answer Set Programming

    Antonio Ielo +5

  25. cs.AI 2026-05-19 reviewed
    ML ensemble forecasts haor floods 72 hours ahead with 89.6% accuracy

    HaorFloodAlert: Deseasonalized ML Ensemble for 72-Hour Flood Prediction in Bangladesh Haor Wetlands

    Salma Hoque Talukdar Koli +3

  26. cs.AI 2026-05-19 reviewed
    Adapting rubric weights speeds RL training by up to 4x

    Not Every Rubric Teaches Equally: Policy-Aware Rubric Rewards for RLVR

    Utkarsh Tyagi +7

  27. cs.CV 2026-05-19 reviewed
    Counterfactual tests expose failures in LVLM attribution for chest X-rays

    Rethinking Visual Attribution for Chest X-ray Reasoning in Large Vision Language Models

    Guangzhi Xiong +4

  28. cs.CL 2026-05-19 reviewed
    Checklist prompts score 7.5 out of 8 on LLM quality rubric

    Less Back-and-Forth: A Comparative Study of Structured Prompting

    Saurav Ghosh +2

  29. cs.LG 2026-05-19 reviewed
    Repeating smaller datasets speeds up training

    Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases

    Jingwen Liu +3

  30. q-bio.NC 2026-05-19 reviewed
    Recovery profiles reveal brain dimensions models miss despite high accuracy

    Beyond Prediction Accuracy: Target-Space Recovery Profiles for Evaluating Model-Brain Alignment

    Ken Nakamura +4

  31. cs.AI 2026-05-19 reviewed
    AI verifies local lemmas for Grasshopper problem but leaves global count unresolved

    Using Aristotle API for AI-Assisted Theorem Proving in Lean 4: A Formalisation Case Study of the Grasshopper Problem

    Gabriel Rongyang Lau

  32. cs.LG 2026-05-19 reviewed
    Single recipe scales time series models from 4M to 2.5B parameters

    Toto 2.0: Time Series Forecasting Enters the Scaling Era

    Emaad Khwaja +12

  33. eess.SY 2026-05-19 reviewed
    Single trajectory yields neural k-inductive barriers for unknown dynamics

    k-Inductive Neural Barrier Certificates for Unknown Nonlinear Dynamics

    Ben Wooding +3

  34. cs.LG 2026-05-19 reviewed
    AutoML for health risk prediction reduces to few key components

    A Reproducible Log-Driven AutoML Framework for Interpretable Pipeline Optimization in Healthcare Risk Prediction

    Rui Huang +1

  35. cs.LG 2026-05-19 reviewed
    No fixed marginal covariance is safe for all geometries in JEPAs

    Beyond Isotropy in JEPAs: Hamiltonian Geometry and Symplectic Prediction

    Robert Jenkinson Alvarez

  36. cs.LG 2026-05-19 reviewed
    Pruning plus retrieval yields up to 5.41× speculative decoding speedups

    Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding

    Yuhao Shen +11

  37. cs.AI 2026-05-19 reviewed
    Argumentation rules turn LLM outputs into faithful ternary claim verdicts

    Neurosymbolic Learning for Inference-Time Argumentation

    Gabriel Freedman +6

  38. cs.LG 2026-05-19 reviewed
    Per-instance shapelets beat population averages on time-series tasks

    INSHAPE: Instance-Level Shapelets for Interpretable Time-Series Classification

    Seongjun Lee +2

  39. cs.CL 2026-05-19 reviewed
    Dataset pairs LLM chats with users' reported thoughts

    ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions

    Chuanyang Jin +8

    5 Piths
  40. cs.CL 2026-05-19 reviewed
    Thoughts collected with LLM chats improve behavior forecasts

    ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions

    Chuanyang Jin +8

    5 Piths
  41. cs.NE 2026-05-19 reviewed
    Evolutionary code agents gain by recycling deleted lines

    What Do Evolutionary Coding Agents Evolve?

    Nico Pelleriti +6

  42. cs.CL 2026-05-19 reviewed
    Joint lattice testing calibrates cascaded RAG thresholds at target risk

    BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation

    Zijun Jia +8

  43. cs.CV 2026-05-19 reviewed
    VLM-guided DPO lifts driving model human alignment by 12%

    VL-DPO: Vision-Language-Guided Finetuning for Preference-Aligned Autonomous Driving

    Zhefan Xu +5

  44. cs.CV 2026-05-19 reviewed
    Adaptive Manifold Guidance conserves probability during strong guidance

    Probability-Conserving Flow Guidance

    Parsa Esmati +4

  45. cs.CL 2026-05-19 reviewed
    Draft answer first then reflect to gain 23% accuracy with 57% fewer tokens

    CopT: Contrastive On-Policy Thinking with Continuous Spaces for General and Agentic Reasoning

    Dachuan Shi +6

  46. cs.CV 2026-05-19 reviewed
    Small tables bind new visual concepts to word triggers

    Tiny-Engram: Trigger-Indexed Concept Tables for Generative Vision

    Runyuan Cai +3

  47. cs.AI 2026-05-19 reviewed
    Moderate noise raises LLM agent success 2.85-fold on puzzle task

    Probing Embodied LLMs: When Higher Observation Fidelity Hurts Problem Solving

    Oussama Zenkri +1

  48. cs.SE 2026-05-19 reviewed
    Staged analysis improves LLM recovery of ROS 2 architectures

    Towards LLM-Assisted Architecture Recovery for Real-World ROS 2 Systems: An Agent-Based Multi-Level Approach to Hierarchical Structural Architecture Reconstruction

    Dominique Briechle +7

  49. cs.CV 2026-05-19 reviewed
    SDM improves adversarial attack performance and efficiency by reconstructing the…

    SDM: A Powerful Tool for Evaluating Model Robustness

    Xinlei Liu +5

  50. cs.CL 2026-05-19 reviewed
    Prompt tuning labels radiology reports with 32 examples

    PromptRad: Knowledge-Enhanced Multi-Label Prompt-Tuning for Low-Resource Radiology Report Labeling

    Ying-Jia Lin +5