archive

Every paper Pith has read.

14513 papers in cs.AI · page 7

cs.LG 2026-05-21 reviewed

Medical world model cuts kidney disease forecast error by 7%
ChronoMedicalWorld: A Medical World Model for Learning Patient Trajectories from Longitudinal Care Data

Jiangyuan Wang +5
cs.AI 2026-05-21 reviewed

AI gives serious games real-time adaptive training
AI-Enabled Serious Games: Integrating Intelligence and Adaptivity in Training Systems

Priyamvada Tripathi +1
cs.CV 2026-05-21 reviewed

MLLMs spot correct video timing in prefill but forget during answers
MLLMs Know When Before Speaking: Revealing and Recovering Temporal Grounding via Attention Cues

Dazhao Du +7
cond-mat.stat-mech 2026-05-21 reviewed

Irreversibility equates four measures and picks low-entropy paths
Thermodynamic Irreversibility of Training Algorithms

Liu Ziyin +3
cs.LG 2026-05-21 reviewed

CausalGuard weights candidate graphs for covered causal effect estimates
CausalGuard: Conformal Inference under Graph Uncertainty

Vikash Singh +14
cs.CV 2026-05-21 reviewed

VLMs favor SDG priors over evidence on 550k-task benchmark
SDGBiasBench: Benchmarking and Mitigating Vision-Language Models' Biases in Sustainable Development Goals

Zihang Lin +3
cs.CV 2026-05-21 reviewed

MAVEN pipeline annotates 5300 videos so 8B VLM beats Gemini on CCTV reasoning
MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks

Han Zhang +4
eess.SY 2026-05-21 reviewed

Physics laws inside neural nets speed up power-grid modeling
Engineering Hybrid Physics-Informed Neural Networks for Next-Generation Electricity Systems: A State-of-the-Art Review

Joseph Nyangon
cs.AI 2026-05-21 reviewed

LLMs now build planners instead of one-off plans
Planning in the LLM Era: Building for Reliability and Efficiency

Michael Katz +3
cs.AI 2026-05-21 reviewed

7B model beats larger ones at Lean proof optimization
ImProver 2: Iteratively Self-Improving LMs for Neurosymbolic Proof Optimization

Riyaz Ahuja +3
cs.CV 2026-05-21 reviewed

Staged fusion of text, audio, and vision reaches 0.47 emotion correlation
Two-Stage Multimodal Framework for Emotion Mimicry Intensity Prediction

Dinithi Dissanayake +4
cs.RO 2026-05-21 reviewed

Action-updated scene prior lifts robot task success
EvoScene-VLA: Evolving Scene Beliefs Inside the Action Decoder for Chunked Robot Control

Chushan Zhang +5
cs.CV 2026-05-21 reviewed

Modular experts resolve gradient conflicts in multi-modal medical pretraining
Learning Emergent Modular Representations in Multi-modality Medical Vision Foundation Models

Yuting He +2
cs.LG 2026-05-21 reviewed

Truncating CoT exposes evasive contamination in LLMs
The Illusion of Reasoning: Exposing Evasive Data Contamination in LLMs via Zero-CoT Truncation

Yifan Lan +4
cs.CV 2026-05-21 reviewed

DoRA raises VLA success rates by 10.4 points over SFT
CrossVLA: Cross-Paradigm Post-Training and Inference Optimization for Vision-Language-Action Models

Zhi Liu
cs.LG 2026-05-21 reviewed

Accumulating oracle signals yields token-level advantages in one pass
OPPO: Bayesian Value Recursion for Token-Level Credit Assignment in LLM Reasoning

Yu Li +3
cs.CL 2026-05-21 reviewed

Agent trajectories compiled into QA pairs improve long-context performance
ACC: Compiling Agent Trajectories for Long-Context Training

Qisheng Su +10
cs.CL 2026-05-21 reviewed

LLMs beat fine-tuned models on rare suicide circumstances
Comparing LLM and Fine-Tuned Model Performance on NVDRS Circumstance Extraction with Varying Prompt Complexity

Geoffrey Martin +2
cs.LG 2026-05-21 reviewed

Tensor Cache stores evicted tokens in outer-product memory
Tensor Cache: Eviction-conditioned Associative Memory for Transformers

Kabir Swain +4
eess.IV 2026-05-20 reviewed

PET/CT model matches full segmentation accuracy with 10% labels
An Open Multi-Center Whole-Body FDG PET/CT Foundation Model for Tumor Segmentation

Xiaofeng Liu +6
cs.AI 2026-05-20 reviewed

Multimodal codes replace IDs in livestream recs
FLUID: From Ephemeral IDs to Multimodal Semantic Codes for Industrial-Scale Livestreaming Recommendation

Xinhang Yuan +8
cs.CL 2026-05-20 reviewed

LLMs reduce ten intensity words to five numeric values
Does Slightly Mean Somewhat? Measuring Vague Intensity Words in LLM Numeric Actions

Daniel Tabach (Georgia Institute of Technology)
cs.AI 2026-05-20 reviewed

AI agents autonomously build custom visualization apps from data
Toward AI VIS Co-Scientists: A General and End-to-End Agent Harness for Solving Complex Data Visualization Tasks

Haichao Miao +6
cs.AI 2026-05-20 reviewed

Crowd preferences yield reusable safety skills for RL tasks
Implicit Safety Alignment from Crowd Preferences

Qian Lin +1
cs.AI 2026-05-20 reviewed

Evolved skills from traces solve hard Verilog tasks
Trace2Skill: Verifier-Guided Skill Evolution for Long-Context EDA Agents

Zijian Du +1
cs.AI 2026-05-20 reviewed

Agentic AI uses 4.33x more energy per successful goal than linear baselines
Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems

Deepak Panigrahy +1
cs.CL 2026-05-20 reviewed

DivSkill-SQL lifts Text-to-SQL accuracy by up to 11 points
Residual Skill Optimization for Text-to-SQL Ensembles

Jiongli Zhu +10
hep-ex 2026-05-20 reviewed

Patch attention model tags LHC jets accurately under tight budgets
Patch Hierarchical Attention Transformer for Efficient Particle Jet Tagging

Aaron Wang +7
cs.AI 2026-05-20 reviewed

Experts disagree on which AI behaviors count as sycophancy
What Counts as AI Sycophancy? A Taxonomy and Expert Survey of a Fragmented Construct

Meryl Ye +7
cs.HC 2026-05-20 reviewed

Trust drives acceptance of collaborative decision tech in pediatrics
Understanding Perspectives of Patients, Caregivers and Clinicians towards Emerging Collaborative-decision Making Technologies

Ray-Yuan Chung +9
cs.AI 2026-05-20 reviewed

Causal links turned into arguments explain ML predictions
A Causal Argumentation Method for Explainability of Machine Learning Models

Henry Salgado +2
cs.LG 2026-05-20 reviewed

Pairwise comparisons yield unbiased preference percentiles
PEARL: Unbiased Percentile Estimation via Contrastive Learning for Industrial-Scale Livestream Recommendation

Blake Gella +8
cs.AI 2026-05-20 reviewed

Platform choice alters AI employment impact estimates by factor of 1.9
Who Uses AI? Platform Selection and the Measurement of Occupational AI Exposure

Michelle Yin +1
cs.AI 2026-05-20 reviewed

Best LLM solves only 40% of drug design tasks
SMDD-Bench: Can LLMs Solve Real-World Small Molecule Drug Design Tasks?

Kevin Han +5
cs.AI 2026-05-20 reviewed

LLM emotional skills prove independent in real chats
AttuneBench: A Conversation-Based Benchmark for LLM Emotional Intelligence

Kate M. Lubrano +6
stat.ML 2026-05-20 reviewed

Support-aware method certifies ad reserve policies from logs
Support-aware offline policy selection for advertising marketplaces

Prashant Shekhar +1
cs.CL 2026-05-20 reviewed

Bayes rule gives LLMs token-by-token attribution scores
Probabilistic Attribution For Large Language Models

Shilpika Shilpika +4
cs.LG 2026-05-20 reviewed

Exact doubly stochastic mixes via transportation polytopes
TBP-mHC: full expressivity for manifold-constrained hyper connections through transportation polytopes

Anton Lyubinin
cs.RO 2026-05-20 reviewed

GNN approximates altruistic robot transfers for scaling teams
Learning Altruistic Collaboration in Heterogeneous Multi-Team Systems

Riwa Karam +3
cs.AI 2026-05-20 reviewed

Pushing past refusal boundary boosts jailbreak success
Latent-space Attacks for Refusal Evasion in Language Models

Giorgio Piras +6
cs.AI 2026-05-20 reviewed

Heavy AI use weakens reasoning skills after help ends
The Impact of AI Usage and Informativeness on Skill Development in Logical Reasoning

Shang Wu +5
cs.CR 2026-05-20 reviewed

Typed boundaries make LLM defense measurable and attributable
PocketAgents: A Manifest-Driven Library of Autonomous Defense Agents

Sidnei Barbieri +2
cs.AI 2026-05-20 reviewed

AI models classify words as vehicles and vegetables as fruit
Investigating Concept Alignment Using Implausible Category Members

Sunayana Rane +2
cs.CL 2026-05-20 reviewed

Open-source LLMs lean left on politics
How Far Will They Go? Red-Teaming Online Influence with Large Language Models

Daniel C. Ruiz +4
cs.CV 2026-05-20 reviewed

AI turns T1 scans into motion-free high-res MRIs
MRecover: A Conditional Generative Model for Recovering Motion-Corrupted MR images Using AI Generated Contrast

Jinghang Li +15
cs.MA 2026-05-20 reviewed

EV charging models face fidelity tradeoffs across three layers
Planning, Scheduling, and Behavior in EV Charging Systems: A Critical Survey and Trilemma Framework

Peiyan Xiao +5
cs.LG 2026-05-20 reviewed

Stochastic policy amortizes diffusion guidance for 5x faster sampling
Hierarchical Variational Policies for Reward-Guided Diffusion

Kushagra Pandey +4
cs.LG 2026-05-20 reviewed

Actor updates match value gradients under differentiable rollouts
Value-Gradient Hypothesis of RL for LLMs

Arip Asadulaev +3
cs.LG 2026-05-20 reviewed

Fine-tuned detectors amplify a pretrained typicality axis
Amplifying, Not Learning: Fine-Tuned AI Text Detectors Amplify a Pretrained Direction

Alexander Smirnov