archive

Every paper Pith has read.

14513 papers in cs.AI · page 13

cs.CL 2026-05-19 reviewed

Prompt tuning with UMLS synonyms labels reports from 32 examples
PromptRad: Knowledge-Enhanced Multi-Label Prompt-Tuning for Low-Resource Radiology Report Labeling

Ying-Jia Lin +5
cs.SE 2026-05-19 reviewed

Cleaner code reduces agent token use by 7-8% with no change in success
Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study

Priyansh Trivedi +1
cs.LG 2026-05-19 reviewed

Critic disagreement guides reward poisoning in RIS networks
When Critics Disagree: Adaptive Reward Poisoning Attacks in RIS-Aided Wireless Control System

Deemah H. Tashman +1
cs.AI 2026-05-19 reviewed

Multi-agent system improves autonomous research by 54.7 percent
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Jiaqi Liu +35
cs.AI 2026-05-19 reviewed

Skills add almost no value to cybersecurity agents with rich tool feedback
When Skills Don't Help: A Negative Result on Procedural Knowledge for Tool-Grounded Agents in Offensive Cybersecurity

Samuel Jacob Chacko +3
cs.LG 2026-05-19 reviewed

Two antagonistic Bayesian processes set the optimal learning rate
Training Neural Networks with Optimal Double-Bayesian Learning

Vy Bui +4
cs.AI 2026-05-19 reviewed

Self-play with code rewards lifts geospatial AI by 5.5 points
GeoX: Mastering Geospatial Reasoning Through Self-Play and Verifiable Rewards

Kyeongjin Ahn +3
cs.LG 2026-05-19 reviewed

LLM benchmarks can be made unlearnable to stop contamination
LLM Benchmark Datasets Should Be Contamination-Resistant

Ali Al-Lawati +3
cs.SE 2026-05-19 reviewed

Agent skills from expert methods beat docs for PostgreSQL tuning
A Case for Agentic Tuning: From Documentation to Action in PostgreSQL

Hongyu Lin +6
cs.LG 2026-05-19 reviewed

Lookahead training improves neural routing policies
Learning with Foresight: Enhancing Neural Routing Policy via Multi-Node Lookahead Prediction

Xia Jiang +3
cs.LG 2026-05-19 reviewed

Block-sphere quantizer lowers MSE and inner-product error
Block-Sphere Vector Quantization

Heesang Ann +2
cs.LG 2026-05-19 reviewed

Entropy change-point detection spots fluent LLM jailbreaks
Detecting Fluent Optimization-Based Adversarial Prompts via Sequential Entropy Changes

Mohammed Alshaalan +1
eess.SP 2026-05-19 reviewed

Rule-based system stages sleep by encoding AASM manual in code
Staging by the Book: Automatic Sleep Stage Classification Using Scoring Rules

Emil Hardarson +5
cs.CV 2026-05-19 reviewed

World-ego split lifts long-horizon hybrid robot modeling
World-Ego Modeling for Long-Horizon Evolution in Hybrid Embodied Tasks

Zuyao Lin +5
cs.DC 2026-05-19 reviewed

GPU-aware expert mapping cuts MoE latency by 7.9 percent on average
GEM: GPU-Variability-Aware Expert to GPU Mapping for MoE Systems

Sourish Wawdhane +2
cs.LG 2026-05-19 reviewed

Position-dependent attention fixes constant risk on shifted reasoning
A Measure-Theoretic Analysis of Reasoning: Structural Generalization and Approximation Limits

Yuyang Zhang +3
cs.AI 2026-05-19 reviewed

Noise in recursion lifts tiny model puzzle accuracy to 99%
Probabilistic Tiny Recursive Model

Amin Sghaier +2
cs.AI 2026-05-19 reviewed

Robotics control ideas yield runtime guardrails for AI social interactions
Robotics-Inspired Guardrails for Foundation Models in Socially Sensitive Domains

Rebecca Ramnauth +2
cs.AI 2026-05-19 reviewed

Context map cache raises LLM agent accuracy 6-34% on recurring tasks
PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents

Zhuohan Gu +3
cs.CV 2026-05-19 reviewed

Model fuses lidar and plot data for lower-bias forest biomass maps
StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels

Reza M. Asiyabi +4
cs.CV 2026-05-19 reviewed

SplitQ keeps 93.5% accuracy at 3-bit VLM quantization
Breaking Modality Heterogeneity in Low-Bit Quantization for Large Vision-Language Models

Yi Zhong +4
cs.GT 2026-05-19 reviewed

Parallel CFR runs 3.3 times faster on billion-history poker trees
Real-Time Parallel Counterfactual Regret Minimization

Boning Li +1
cs.LG 2026-05-19 reviewed

Fast method learns node reps from labels without features
Fast and Featureless Node Representation Learning with Partial Pairwise Supervision

Sujan Chakraborty +1
cs.AI 2026-05-19 reviewed

CNN on solutions guides LLM to write 1000x faster streamliners
Streamlined Constraint Reasoning via CNN Pattern Recognition on Enumerated Solutions

Patrick Spracklen
cs.DC 2026-05-19 reviewed

Space data centers process satellite data in orbit
Deep Tech to Space: Space Data Centers and AI Revolution at the Edge

Jonas Weiss +18
cs.CV 2026-05-19 reviewed

Persona prompts lift construction safety checks by 12 percent
Passive Construction Site Safety Monitoring via Persona-Scaffolded Adversarial Chain-of-Thought VLM Verification

Ananth Sriram +2
cs.LG 2026-05-19 reviewed

Post-backprop rescaling fixes gradient scales in deep nets without BatchNorm
StableGrad: Backward Scale Control without Batch Normalization

Jose I. Mestre +4
cs.CV 2026-05-19 reviewed

Zero-shot image models fall short on concept faithfulness for XAI
A Framework for Evaluating Zero-Shot Image Generation in Concept-based Explainability

Giacomo Astolfi +4
cs.CV 2026-05-19 reviewed

Open VLMs struggle with fine details in human video actions
FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding

Gueter Josmy Faure +4
cs.CV 2026-05-19 reviewed

Dense benchmark exposes open VLMs' gaps on subtle human actions
FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding

Gueter Josmy Faure +4
cs.CV 2026-05-19 reviewed

Dual-stream network lifts weather detection at full speed
CADENet: Condition-Adaptive Asynchronous Dual-Stream Enhancement Network for Adverse Weather Perception in Autonomous Driving

Sherif Khairy +1
cs.LG 2026-05-19 reviewed

Framework fuses sensor data with physics rules for better passenger counts
A Closed-loop, State-centric, Multi-agent Framework for Passenger Load Estimation from Heterogeneous Data Streams

Yiyao Xu +3
cs.SD 2026-05-19 reviewed

Scaled simulations cut speech recognition errors over 30 percent
Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation

Zhifei Xie +6
cs.AI 2026-05-19 reviewed

Structured simulator cuts wastewater regret by 43.6 percent
Explainable Wastewater Digital Twins: Adaptive Context-Conditioned Structured Simulators with Self-Falsifying Decision Support

Gary Simethy +2
cs.AI 2026-05-19 reviewed

Temporal conditioning changes AV planner style but not scores
From Prompts to Pavement Through Time: Temporal Grounding in Agentic Scene-to-Plan Reasoning

Ahmed Y. Gado +4
cs.LG 2026-05-19 reviewed

Domain cuts let neural operators handle PDE discontinuities
Smooth Piecewise Cutting for Neural Operator to Handle Discontinuities and Sharp Transitions

Ha Dang +2
cs.LG 2026-05-19 reviewed

Explainer splits stable and changing links for temporal GNNs
ST-TGExplainer: Disentangling Stability and Transition Patterns for Temporal GNN Interpretability

Hongjiang Chen +7
cs.CL 2026-05-19 reviewed

Rubric shows LLMs generate mostly high-quality legal propositions
LP-Eval: Rubric and Dataset for Measuring the Quality of Legal Proposition Generation

Shanshan Xu +4
cs.LG 2026-05-19 reviewed

Benchmark separates ML models on flux extrapolation via tail errors
FLUXtrapolation: A benchmark on extrapolating ecosystem fluxes

Anya Fries +4
cs.CL 2026-05-19 reviewed

Section-based chunking tops recall in German legal retrieval
Chunking German Legal Code

Max Prior +2
cs.LG 2026-05-19 reviewed

Laplace diffusion generates long forecasts for irregular time series
Latent Laplace Diffusion for Irregular Multivariate Time Series

Zinuo You +2
cs.CV 2026-05-19 reviewed

Stitched model lifts rewards to noisy latents for faster alignment
Stitched Value Model for Diffusion Alignment

Hyojun Go +10
cs.CV 2026-05-19 reviewed

Semi-supervised method reaches 79.99% Dice in fetal heart ultrasound
Synergistic Foundation Models for Semi-Supervised Fetal Cardiac Ultrasound Analysis: SAM-Med2D Boundary Refinement and DINOv3 Semantic Enhancement

Tonghao Zhuang +7
cs.HC 2026-05-19 reviewed

Protocol captures synchronized multimodal meeting data
AffectAI-Capture: A Reproducible Multimodal Protocol for Small-Group Meeting Research

Meisam Jamshidi Seikavandi +8
cs.AI 2026-05-19 reviewed

LLMs optimize code via priors
Prior Knowledge or Search? A Study of LLM Agents in Hardware-Aware Code Optimization

Dmitry Redko +9
cs.AI 2026-05-19 reviewed

Data-driven rule picks best SGD-to-Muon geometry per layer
From SGD to Muon: Adaptive Optimization via Schatten-p Norms

Thomas Massena (IRIT) +4
cs.AI 2026-05-19 reviewed

Conformal methods deliver distribution-free coverage for AI agent scores
Distribution-Free Uncertainty Quantification for Continuous AI Agent Evaluation

Yuxuan Gao +2
cs.AI 2026-05-19 reviewed

Hard-coded verifiers beat LLM judges at matching human evaluations
OpenComputer: Verifiable Software Worlds for Computer-Use Agents

Jinbiao Wei +6
cs.AI 2026-05-19 reviewed

Variance-aware regret bound proven optimal for logistic MDPs
Minimax Optimal Variance-Aware Regret Bounds for Multinomial Logistic MDPs

Pierre Boudart (SIERRA) +4
cs.LG 2026-05-19 reviewed

Rank-1 queries keep ZO signals strong for high-rank LoRA
AR1-ZO: Topology-Aware Rank-1 Zeroth-Order Queries for High-Rank LoRA Fine-Tuning

Ziye Chen +5