archive

Every paper Pith has read.

14513 papers in cs.AI · page 12

cs.AI 2026-05-19 reviewed

AgentCo-op links existing agents into genomics workflows without redesign
AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

Shuaike Shen +4
cs.AI 2026-05-19 reviewed

RL method raises ToM accuracy from 0.2% to 76% on asymmetric tasks
OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind

Sharmin Sultana Srishty +4
cs.CL 2026-05-19 reviewed

CoT prompting leaves gender bias inside LLMs
Mechanics of Bias and Reasoning: Interpreting the Impact of Chain-of-Thought Prompting on Gender Bias in LLMs

Edie Pearman +5
eess.IV 2026-05-19 reviewed

This paper tests episodic sampling to build class-balanced batches for CT body…
Disentangling Sampling from Training Budget in Class-Imbalanced CT Body Composition Segmentation

Iason Skylitsis +2
cs.LG 2026-05-19 reviewed

MXFP4 error splits into three parts each fixing a different RL failure
Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

Xiaocan Li +2
cs.LG 2026-05-19 reviewed

MXFP4 error splits into three parts for targeted RL fixes
Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

Xiaocan Li +2
cs.CV 2026-05-19 reviewed

Bigger 3D models trained on 50M driving scenes top Waymo leaderboard
STELLAR: Scaling 3D Perception Large Models for Autonomous Driving

Yingwei Li +15
cs.LG 2026-05-19 reviewed

Integral operators gain from longer windows in fMRI tasks
Nonlocal operator learning for fMRI encoding and decoding tasks

Andreas Kramer +3
cs.CV 2026-05-19 reviewed

Meta-RL extracts rules to segment concepts at any reasoning level
ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning

Yuan Zhao +12
cs.CL 2026-05-19 reviewed

LLMs switch from instructions to patterns when history conflicts
Do as I Say, Not as I Do: Instruction-Induction Conflict in LLMs

Carolina Camassa +1
cs.RO 2026-05-19 reviewed

Human videos scale humanoid loco-manipulation without custom rewards
SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework

Tianshu Wu +7
cs.CV 2026-05-19 reviewed

Distortion in latent space guides better sampling for missing modalities
Latent Space Guided Scenario Sampling for Multimodal Segmentation Under Missing Modalities

Irem Ulku +2
cs.CL 2026-05-19 reviewed

DEL raises LLM number prediction accuracy on math benchmarks
DEL: Digit Entropy Loss for Numerical Learning of Large Language Models

Zhaohui Zheng +5
cs.CR 2026-05-19 reviewed

Local model classifies security documents at 95 percent accuracy
Security Document Classification with a Fine-Tuned Local Large Language Model: Benchmark Data and an Open-Source System

Ivan Dobrovolskyi
cs.LG 2026-05-19 reviewed

Per-sample temperatures make teacher soft labels consistent
Consistently Informative Soft-Label Temperature for Knowledge Distillation

Hoang-Chau Luong +3
cs.CL 2026-05-19 reviewed

AI dialogue models sync states and predict turns ahead
Synchronization and Turn-Taking in Full-Duplex Speech Dialogue Models

Pablo Riera +4
q-fin.CP 2026-05-19 reviewed

Memory lets RL agents beat competitive benchmarks in trade execution
Memory-Induced Supra-Competitive Outcomes Between Deep Reinforcement Learning Agents in Optimal Trade Execution

Christos Spyridon Koulouris +1
cs.LG 2026-05-19 reviewed

Krylov approximation unlearns data 48x faster than retraining
Causal Unlearning in Collaborative Optimization: Exact and Approximate Influence Reversal under Adversarial Contributions

Ali Mahdavi +3
cond-mat.stat-mech 2026-05-19 reviewed

Target-SAT triples solvable size for hardest random 3-SAT
Targeting Clause Type Distributions: a Picklock for Random Satisfiability Problems

J. Schwardt +1
cond-mat.str-el 2026-05-19 reviewed

NN variational 2-RDM reaches 0.1 meV below exact energy for Chern insulator
Representability-Aware Neural Networks for Reduced Density Matrices: Application to Fractional Chern Insulators

Justin B. Hart +6
cs.CV 2026-05-19 reviewed

LoRA upgrade turns text-to-image flows bidirectional
FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision-Language Generation

Eric Tillmann Bill +3
cs.LG 2026-05-19 reviewed

EEG microstates from one clustering step outperform traditional features on multiple tasks
Atoms of Thought: Universal EEG Representation Learning with Microstates

Xinyang Tian +5
cs.AI 2026-05-19 reviewed

Four-part SDB contract organizes LLM agent runtimes
A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents

Vasundra Srinivasan
cs.LO 2026-05-19 reviewed

ASP automates long-term power grid planning
Long-term Power Grid Planning via Answer Set Programming

Antonio Ielo +5
cs.AI 2026-05-19 reviewed

ML ensemble forecasts haor floods 72 hours ahead with 89.6% accuracy
HaorFloodAlert: Deseasonalized ML Ensemble for 72-Hour Flood Prediction in Bangladesh Haor Wetlands

Salma Hoque Talukdar Koli +3
cs.AI 2026-05-19 reviewed

Adapting rubric weights speeds RL training by up to 4x
Not Every Rubric Teaches Equally: Policy-Aware Rubric Rewards for RLVR

Utkarsh Tyagi +7
cs.CV 2026-05-19 reviewed

Counterfactual tests expose failures in LVLM attribution for chest X-rays
Rethinking Visual Attribution for Chest X-ray Reasoning in Large Vision Language Models

Guangzhi Xiong +4
cs.CL 2026-05-19 reviewed

Checklist prompts score 7.5 out of 8 on LLM quality rubric
Less Back-and-Forth: A Comparative Study of Structured Prompting

Saurav Ghosh +2
cs.LG 2026-05-19 reviewed

Repeating smaller datasets speeds up training
Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases

Jingwen Liu +3
q-bio.NC 2026-05-19 reviewed

Recovery profiles reveal brain dimensions models miss despite high accuracy
Beyond Prediction Accuracy: Target-Space Recovery Profiles for Evaluating Model-Brain Alignment

Ken Nakamura +4
cs.AI 2026-05-19 reviewed

AI verifies local lemmas for Grasshopper problem but leaves global count unresolved
Using Aristotle API for AI-Assisted Theorem Proving in Lean 4: A Formalisation Case Study of the Grasshopper Problem

Gabriel Rongyang Lau
cs.LG 2026-05-19 reviewed

Single recipe scales time series models from 4M to 2.5B parameters
Toto 2.0: Time Series Forecasting Enters the Scaling Era

Emaad Khwaja +12
eess.SY 2026-05-19 reviewed

Single trajectory yields neural k-inductive barriers for unknown dynamics
k-Inductive Neural Barrier Certificates for Unknown Nonlinear Dynamics

Ben Wooding +3
cs.LG 2026-05-19 reviewed

AutoML for health risk prediction reduces to few key components
A Reproducible Log-Driven AutoML Framework for Interpretable Pipeline Optimization in Healthcare Risk Prediction

Rui Huang +1
cs.LG 2026-05-19 reviewed

No fixed marginal covariance is safe for all geometries in JEPAs
Beyond Isotropy in JEPAs: Hamiltonian Geometry and Symplectic Prediction

Robert Jenkinson Alvarez
cs.LG 2026-05-19 reviewed

Pruning plus retrieval yields up to 5.41× speculative decoding speedups
Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding

Yuhao Shen +11
cs.AI 2026-05-19 reviewed

Argumentation rules turn LLM outputs into faithful ternary claim verdicts
Neurosymbolic Learning for Inference-Time Argumentation

Gabriel Freedman +6
cs.LG 2026-05-19 reviewed

Per-instance shapelets beat population averages on time-series tasks
INSHAPE: Instance-Level Shapelets for Interpretable Time-Series Classification

Seongjun Lee +2
cs.CL 2026-05-19 reviewed

Dataset pairs LLM chats with users' reported thoughts
ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions

Chuanyang Jin +8

5 Piths
cs.CL 2026-05-19 reviewed

Thoughts collected with LLM chats improve behavior forecasts
ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions

Chuanyang Jin +8

5 Piths
cs.NE 2026-05-19 reviewed

Evolutionary code agents gain by recycling deleted lines
What Do Evolutionary Coding Agents Evolve?

Nico Pelleriti +6
cs.CL 2026-05-19 reviewed

Joint lattice testing calibrates cascaded RAG thresholds at target risk
BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation

Zijun Jia +8
cs.CV 2026-05-19 reviewed

VLM-guided DPO lifts driving model human alignment by 12%
VL-DPO: Vision-Language-Guided Finetuning for Preference-Aligned Autonomous Driving

Zhefan Xu +5
cs.CV 2026-05-19 reviewed

Adaptive Manifold Guidance conserves probability during strong guidance
Probability-Conserving Flow Guidance

Parsa Esmati +4
cs.CL 2026-05-19 reviewed

Draft answer first then reflect to gain 23% accuracy with 57% fewer tokens
CopT: Contrastive On-Policy Thinking with Continuous Spaces for General and Agentic Reasoning

Dachuan Shi +6
cs.CV 2026-05-19 reviewed

Small tables bind new visual concepts to word triggers
Tiny-Engram: Trigger-Indexed Concept Tables for Generative Vision

Runyuan Cai +3
cs.AI 2026-05-19 reviewed

Moderate noise raises LLM agent success 2.85-fold on puzzle task
Probing Embodied LLMs: When Higher Observation Fidelity Hurts Problem Solving

Oussama Zenkri +1
cs.SE 2026-05-19 reviewed

Staged analysis improves LLM recovery of ROS 2 architectures
Towards LLM-Assisted Architecture Recovery for Real-World ROS 2 Systems: An Agent-Based Multi-Level Approach to Hierarchical Structural Architecture Reconstruction

Dominique Briechle +7
cs.CV 2026-05-19 reviewed

SDM improves adversarial attack performance and efficiency by reconstructing the…
SDM: A Powerful Tool for Evaluating Model Robustness

Xinlei Liu +5
cs.CL 2026-05-19 reviewed

Prompt tuning labels radiology reports with 32 examples
PromptRad: Knowledge-Enhanced Multi-Label Prompt-Tuning for Low-Resource Radiology Report Labeling

Ying-Jia Lin +5