archive

Every paper Pith has read.

14513 papers in cs.AI · page 7

  1. cs.LG 2026-05-21 reviewed
    Medical world model cuts kidney disease forecast error by 7%

    ChronoMedicalWorld: A Medical World Model for Learning Patient Trajectories from Longitudinal Care Data

    Jiangyuan Wang +5

  2. cs.AI 2026-05-21 reviewed
    AI gives serious games real-time adaptive training

    AI-Enabled Serious Games: Integrating Intelligence and Adaptivity in Training Systems

    Priyamvada Tripathi +1

  3. cs.CV 2026-05-21 reviewed
    MLLMs spot correct video timing in prefill but forget during answers

    MLLMs Know When Before Speaking: Revealing and Recovering Temporal Grounding via Attention Cues

    Dazhao Du +7

  4. cond-mat.stat-mech 2026-05-21 reviewed
    Irreversibility equates four measures and picks low-entropy paths

    Thermodynamic Irreversibility of Training Algorithms

    Liu Ziyin +3

  5. cs.LG 2026-05-21 reviewed
    CausalGuard weights candidate graphs for covered causal effect estimates

    CausalGuard: Conformal Inference under Graph Uncertainty

    Vikash Singh +14

  6. cs.CV 2026-05-21 reviewed
    VLMs favor SDG priors over evidence on 550k-task benchmark

    SDGBiasBench: Benchmarking and Mitigating Vision-Language Models' Biases in Sustainable Development Goals

    Zihang Lin +3

  7. cs.CV 2026-05-21 reviewed
    MAVEN pipeline annotates 5300 videos so 8B VLM beats Gemini on CCTV reasoning

    MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks

    Han Zhang +4

  8. eess.SY 2026-05-21 reviewed
    Physics laws inside neural nets speed up power-grid modeling

    Engineering Hybrid Physics-Informed Neural Networks for Next-Generation Electricity Systems: A State-of-the-Art Review

    Joseph Nyangon

  9. cs.AI 2026-05-21 reviewed
    LLMs now build planners instead of one-off plans

    Planning in the LLM Era: Building for Reliability and Efficiency

    Michael Katz +3

  10. cs.AI 2026-05-21 reviewed
    7B model beats larger ones at Lean proof optimization

    ImProver 2: Iteratively Self-Improving LMs for Neurosymbolic Proof Optimization

    Riyaz Ahuja +3

  11. cs.CV 2026-05-21 reviewed
    Staged fusion of text, audio, and vision reaches 0.47 emotion correlation

    Two-Stage Multimodal Framework for Emotion Mimicry Intensity Prediction

    Dinithi Dissanayake +4

  12. cs.RO 2026-05-21 reviewed
    Action-updated scene prior lifts robot task success

    EvoScene-VLA: Evolving Scene Beliefs Inside the Action Decoder for Chunked Robot Control

    Chushan Zhang +5

  13. cs.CV 2026-05-21 reviewed
    Modular experts resolve gradient conflicts in multi-modal medical pretraining

    Learning Emergent Modular Representations in Multi-modality Medical Vision Foundation Models

    Yuting He +2

  14. cs.LG 2026-05-21 reviewed
    Truncating CoT exposes evasive contamination in LLMs

    The Illusion of Reasoning: Exposing Evasive Data Contamination in LLMs via Zero-CoT Truncation

    Yifan Lan +4

  15. cs.CV 2026-05-21 reviewed
    DoRA raises VLA success rates by 10.4 points over SFT

    CrossVLA: Cross-Paradigm Post-Training and Inference Optimization for Vision-Language-Action Models

    Zhi Liu

  16. cs.LG 2026-05-21 reviewed
    Accumulating oracle signals yields token-level advantages in one pass

    OPPO: Bayesian Value Recursion for Token-Level Credit Assignment in LLM Reasoning

    Yu Li +3

  17. cs.LG 2026-05-21 reviewed
    Accumulating oracle signals yields token-level advantages for LLMs

    OPPO: Bayesian Value Recursion for Token-Level Credit Assignment in LLM Reasoning

    Yu Li +3

  18. cs.CL 2026-05-21 reviewed
    Agent trajectories compiled into QA pairs improve long-context performance

    ACC: Compiling Agent Trajectories for Long-Context Training

    Qisheng Su +10

  19. cs.CL 2026-05-21 reviewed
    LLMs beat fine-tuned models on rare suicide circumstances

    Comparing LLM and Fine-Tuned Model Performance on NVDRS Circumstance Extraction with Varying Prompt Complexity

    Geoffrey Martin +2

  20. cs.LG 2026-05-21 reviewed
    Tensor Cache stores evicted tokens in outer-product memory

    Tensor Cache: Eviction-conditioned Associative Memory for Transformers

    Kabir Swain +4

  21. eess.IV 2026-05-20 reviewed
    PET/CT model matches full segmentation accuracy with 10% labels

    An Open Multi-Center Whole-Body FDG PET/CT Foundation Model for Tumor Segmentation

    Xiaofeng Liu +6

  22. cs.AI 2026-05-20 reviewed
    Multimodal codes replace IDs in livestream recs

    FLUID: From Ephemeral IDs to Multimodal Semantic Codes for Industrial-Scale Livestreaming Recommendation

    Xinhang Yuan +8

  23. cs.CL 2026-05-20 reviewed
    LLMs reduce ten intensity words to five numeric values

    Does Slightly Mean Somewhat? Measuring Vague Intensity Words in LLM Numeric Actions

    Daniel Tabach (Georgia Institute of Technology)

  24. cs.AI 2026-05-20 reviewed
    AI agents autonomously build custom visualization apps from data

    Toward AI VIS Co-Scientists: A General and End-to-End Agent Harness for Solving Complex Data Visualization Tasks

    Haichao Miao +6

  25. cs.AI 2026-05-20 reviewed
    Crowd preferences yield reusable safety skills for RL tasks

    Implicit Safety Alignment from Crowd Preferences

    Qian Lin +1

  26. cs.AI 2026-05-20 reviewed
    Evolved skills from traces solve hard Verilog tasks

    Trace2Skill: Verifier-Guided Skill Evolution for Long-Context EDA Agents

    Zijian Du +1

  27. cs.AI 2026-05-20 reviewed
    Agentic AI uses 4.33x more energy per successful goal than linear baselines

    Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems

    Deepak Panigrahy +1

  28. cs.CL 2026-05-20 reviewed
    DivSkill-SQL lifts Text-to-SQL accuracy by up to 11 points

    Residual Skill Optimization for Text-to-SQL Ensembles

    Jiongli Zhu +10

  29. hep-ex 2026-05-20 reviewed
    Patch attention model tags LHC jets accurately under tight budgets

    Patch Hierarchical Attention Transformer for Efficient Particle Jet Tagging

    Aaron Wang +7

  30. cs.AI 2026-05-20 reviewed
    Experts disagree on which AI behaviors count as sycophancy

    What Counts as AI Sycophancy? A Taxonomy and Expert Survey of a Fragmented Construct

    Meryl Ye +7

  31. cs.HC 2026-05-20 reviewed
    Trust drives acceptance of collaborative decision tech in pediatrics

    Understanding Perspectives of Patients, Caregivers and Clinicians towards Emerging Collaborative-decision Making Technologies

    Ray-Yuan Chung +9

  32. cs.AI 2026-05-20 reviewed
    Causal links turned into arguments explain ML predictions

    A Causal Argumentation Method for Explainability of Machine Learning Models

    Henry Salgado +2

  33. cs.LG 2026-05-20 reviewed
    Pairwise comparisons yield unbiased preference percentiles

    PEARL: Unbiased Percentile Estimation via Contrastive Learning for Industrial-Scale Livestream Recommendation

    Blake Gella +8

  34. cs.AI 2026-05-20 reviewed
    Platform choice alters AI employment impact estimates by factor of 1.9

    Who Uses AI? Platform Selection and the Measurement of Occupational AI Exposure

    Michelle Yin +1

  35. cs.AI 2026-05-20 reviewed
    Best LLM solves only 40% of drug design tasks

    SMDD-Bench: Can LLMs Solve Real-World Small Molecule Drug Design Tasks?

    Kevin Han +5

  36. cs.AI 2026-05-20 reviewed
    LLM emotional skills prove independent in real chats

    AttuneBench: A Conversation-Based Benchmark for LLM Emotional Intelligence

    Kate M. Lubrano +6

  37. stat.ML 2026-05-20 reviewed
    Support-aware method certifies ad reserve policies from logs

    Support-aware offline policy selection for advertising marketplaces

    Prashant Shekhar +1

  38. cs.CL 2026-05-20 reviewed
    Bayes rule gives LLMs token-by-token attribution scores

    Probabilistic Attribution For Large Language Models

    Shilpika Shilpika +4

  39. cs.LG 2026-05-20 reviewed
    Exact doubly stochastic mixes via transportation polytopes

    TBP-mHC: full expressivity for manifold-constrained hyper connections through transportation polytopes

    Anton Lyubinin

  40. cs.RO 2026-05-20 reviewed
    GNN approximates altruistic robot transfers for scaling teams

    Learning Altruistic Collaboration in Heterogeneous Multi-Team Systems

    Riwa Karam +3

  41. cs.AI 2026-05-20 reviewed
    Pushing past refusal boundary boosts jailbreak success

    Latent-space Attacks for Refusal Evasion in Language Models

    Giorgio Piras +6

  42. cs.AI 2026-05-20 reviewed
    Heavy AI use weakens reasoning skills after help ends

    The Impact of AI Usage and Informativeness on Skill Development in Logical Reasoning

    Shang Wu +5

  43. cs.CR 2026-05-20 reviewed
    Typed boundaries make LLM defense measurable and attributable

    PocketAgents: A Manifest-Driven Library of Autonomous Defense Agents

    Sidnei Barbieri +2

  44. cs.AI 2026-05-20 reviewed
    AI models classify words as vehicles and vegetables as fruit

    Investigating Concept Alignment Using Implausible Category Members

    Sunayana Rane +2

  45. cs.CL 2026-05-20 reviewed
    Open-source LLMs lean left on politics

    How Far Will They Go? Red-Teaming Online Influence with Large Language Models

    Daniel C. Ruiz +4

  46. cs.CV 2026-05-20 reviewed
    AI turns T1 scans into motion-free high-res MRIs

    MRecover: A Conditional Generative Model for Recovering Motion-Corrupted MR images Using AI Generated Contrast

    Jinghang Li +15

  47. cs.MA 2026-05-20 reviewed
    EV charging models face fidelity tradeoffs across three layers

    Planning, Scheduling, and Behavior in EV Charging Systems: A Critical Survey and Trilemma Framework

    Peiyan Xiao +5

  48. cs.LG 2026-05-20 reviewed
    Stochastic policy amortizes diffusion guidance for 5x faster sampling

    Hierarchical Variational Policies for Reward-Guided Diffusion

    Kushagra Pandey +4

  49. cs.LG 2026-05-20 reviewed
    Actor updates match value gradients under differentiable rollouts

    Value-Gradient Hypothesis of RL for LLMs

    Arip Asadulaev +3

  50. cs.LG 2026-05-20 reviewed
    Fine-tuned detectors amplify a pretrained typicality axis

    Amplifying, Not Learning: Fine-Tuned AI Text Detectors Amplify a Pretrained Direction

    Alexander Smirnov