archive

Every paper Pith has read.

14513 papers in cs.AI · page 15

cs.RO 2026-05-19 reviewed

Adaptive coaching speeds learning to use robot guide dogs
CANINE: Coaching Visually Impaired Users for Interactive Navigation with a Robot Guide Dog

Cunjun Yu +4
cs.AI 2026-05-19 reviewed

Attention-guided RL raises jailbreak success on reasoning models
Attention-Guided Reward for Reinforcement Learning-based Jailbreak against Large Reasoning Models

Zheng Lin +4
cs.CV 2026-05-19 reviewed

GUI agents reach only 36% success on media editing tasks
CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing

Haobo Hu +6
cs.LG 2026-05-19 reviewed

Finite dynamics samples enforce safety during reinforcement learning
Sampling-Based Safe Reinforcement Learning

Luca Vignola +6
cs.LG 2026-05-19 reviewed

Pre-training boosts time series detection by 375% but not forecasting
Quantifying the Pre-training Dividend: Generative versus Latent Self-Supervised Learning for Time Series Foundation Models

Noam Major +2
cs.AI 2026-05-19 reviewed

Group reward targets keep solution diversity alive in RL reasoning
Beyond Mode Collapse: Distribution Matching for Diverse Reasoning

Xiaozhe Li +12
cs.AI 2026-05-19 reviewed

GUIDE raises ad GMV 4.1% with built-in safety fallback
Generative Auto-Bidding with Unified Modeling and Exploration

Mingming Zhang +9
cs.DC 2026-05-19 reviewed

Predictor accuracy sets exact fault tolerance in Byzantine agreement
Resilient Byzantine Agreement with Predictions

Julien Dallot +4
cs.AI 2026-05-19 reviewed

Selective feedback reweighting lifts multi-turn agent success to 90%
What and When to Distill: Selective Hindsight Distillation for Multi-Turn Agents

Xiaozhe Li +8
cs.CV 2026-05-19 reviewed

Targeted attacks succeed on encoders without knowing the task
Targeted Downstream-Agnostic Attack

Zhuxin Lei +2
cs.LG 2026-05-19 reviewed

Spiking blocks replace Transformer nonlinearities with <1% accuracy drop
Plug-and-Play Spiking Operators: Breaking the Nonlinearity Bottleneck in Spiking Transformers

Xinzhe Yuan +6
cs.LG 2026-05-19 reviewed

Majority vote locks wrong answers after brief correct window in TTRL
Detecting and Mitigating the Correct-Answer Extinction Window in Test-Time Reinforcement Learning with Majority Voting

Hongxiang Lin +3
cs.LG 2026-05-19 reviewed

Model fuses layout and netlist to predict cell delay at 0.92% error
FusionCell: Cross-Attentive Fusion of Layout Geometry and Netlist Topology for Standard-Cell Performance Prediction

Haoyi Zhang +4
cs.CV 2026-05-19 reviewed

Prototype-anchored training halves calibration error in place recognition
KappaPlace: Learning Hyperspherical Uncertainty for Visual Place Recognition via Prototype-Anchored Supervision

Maya Yanko +1
cs.CL 2026-05-19 reviewed

Backtracking fixes dual biases in LLM reasoning distillation
Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation

Bing Wang +9
cs.LG 2026-05-19 reviewed

Output-layer gradient norm gates reuse to cut RLVR samples by 2.93x
When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR

Yuchun Miao +6
eess.SP 2026-05-19 reviewed

Pilot-only model beats full-CSI baselines across frequencies
PilotWiMAE: Pilot-Native Representation Learning for Wireless Channels

Berkay Guler +2
cs.AI 2026-05-19 reviewed

Signed graphs let AI agents resolve conflicts for better reasoning
Conflict-Resilient Multi-Agent Reasoning via Signed Graph Modeling

Longgang He +3
cs.LG 2026-05-19 reviewed

Feedback prefixing improves LLM scaling by up to 2.8x efficiency
Introspective X Training: Feedback Conditioning Improves Scaling Across all LLM Training Stages

Brandon Cui +9
cs.LG 2026-05-19 reviewed

ODE paths limit forgetting when merging models sequentially
Unlocking the Potential of Continual Model Merging: An ODE Perspective

Lihong Lin +1
cs.LG 2026-05-19 reviewed

ODE traces low-loss paths for sequential model merging
Unlocking the Potential of Continual Model Merging: An ODE Perspective

Lihong Lin +1
cs.LG 2026-05-19 reviewed

Large models improve with unfiltered low-quality data
A Bitter Lesson for Data Filtering

Christopher Mohri +2
cs.CV 2026-05-19 reviewed

JUDO outperforms GPT-4o on industrial anomaly QA with normal image references
JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA

Hyunju Kang +3
cs.CV 2026-05-19 reviewed

Rebalancing attention boosts motion in image-to-video models
Rebalancing Reference Frame Dominance to Improve Motion in Image-to-Video Models

Wooseok Jeon +5
cs.CV 2026-05-19 reviewed

Rebalancing attention reduces reference dominance and increases video motion
Rebalancing Reference Frame Dominance to Improve Motion in Image-to-Video Models

Wooseok Jeon +5
cs.CV 2026-05-19 reviewed

Unlearning methods leave class traces in model representations
Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning

Zhenyu Yu +4
cs.CL 2026-05-19 reviewed

Reassembling entity pairs boosts synthetic QA accuracy by 88.9%
EmbGen: Teaching with Reassembled Corpora

Arun K Lenin +3
cs.AI 2026-05-19 reviewed

LLMs run code for videos but miss spatial accuracy
PRISM: A Benchmark for Programmatic Spatial-Temporal Reasoning

Qiran Zhang +11
cs.LG 2026-05-19 reviewed

LLM safety benchmarks are orbits under group actions
The Evaluation Game: Beyond Static LLM Benchmarking

Paul Wang +3
cs.AI 2026-05-19 reviewed

Stochastic trajectories let recursive models reason with multiple hypotheses
Generative Recursive Reasoning

Junyeob Baek +5
cs.AI 2026-05-19 reviewed

Probabilistic recursion lets models sample many reasoning paths
Generative Recursive Reasoning

Junyeob Baek +5
cs.CV 2026-05-19 reviewed

Concept ontology filters noisy negatives to lift chest X-ray zero-shot tasks
Concept-Guided Noisy Negative Suppression for Zero-Shot Classification and Grounding of Chest X-Ray Findings

Chenyu Lian +3
cs.CV 2026-05-19 reviewed

Heat dissipation flow matching outperforms most baselines
Multi-Scale Generative Modeling with Heat Dissipation Flow Matching

Jun Ma +4
cs.HC 2026-05-19 reviewed

Few agent skill specs fully disclose capabilities to users
Toward User Comprehension Supports for LLM Agent Skill Specifications

Zikai Alex Wen
cs.HC 2026-05-19 reviewed

Only 19% of cybersecurity skills include example cues for users
Toward User Comprehension Supports for LLM Agent Skill Specifications

Zikai Alex Wen
cs.GR 2026-05-19 reviewed

Repositioned anchors keep motion contacts across body shapes
Skinned Motion Retargeting with Spatially Adaptive Interaction Guidance

Soojin Choi +5
q-bio.NC 2026-05-19 reviewed

Action models align asymmetrically with brain action signals
Brain alignment of reasoning and action representations from vision-language and action models during naturalistic gameplay

Subba Reddy Oota +6
cs.MA 2026-05-19 reviewed

Architecture lets AI agents break rules legitimately when justified
PAVE: A Cognitive Architecture for Legitimate Violation in Generative Agent Societies

Ahmad Yehia +6
cs.LG 2026-05-19 reviewed

Claim differences as RL rewards balance caption hallucinations and omissions
ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison

Tianle Li +9
cs.CL 2026-05-19 reviewed

Supreme Court quashes 18 points more matrimonial petitions than Karnataka HC
IMLJD: A Computational Dataset for Indian Matrimonial Litigation Analysis

Joy Bose
cs.CV 2026-05-19 reviewed

Integral feedback reduces hallucinations in CT medical reports
Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis

Tianwei Lin +9
cs.CL 2026-05-19 reviewed

Benchmark labels hallucinations via explicit reference worlds
HalluWorld: A Controlled Benchmark for Hallucination via Reference World Models

Emmy Liu +6

cs.MA 2026-05-19 reviewed

STAR-PólyaMath hits perfect scores on Putnam and IMO
STAR-PólyaMath: Multi-Agent Reasoning under Persistent Meta-Strategic Supervision

Jiaao Wu +5
cs.AI 2026-05-19 reviewed

Only 2 of 19 LLM trading studies use time-consistent data splits
Agentic Trading: When LLM Agents Meet Financial Markets

Yihan Xia +6
q-bio.QM 2026-05-19 reviewed

Protein Thoughts ranks true binders at mean position 11.2
Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery

Kingsley Yeon +2
cs.GT 2026-05-19 reviewed

LLMs close 99% of deals but earn low profits in hidden pricing
PrefBench: Evaluating Zero-Shot LLM Agents in Hidden-Preference Personalized Pricing Negotiations

Yingjie Lei
cs.AI 2026-05-19 reviewed

MOCHA improves agent skill correctness on every task
MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

Md Mehrab Tanjim +8
cs.CV 2026-05-19 reviewed

Event streams lift VLM captioning and VQA scores in low light and motion
RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding

Hanqing Liu +5
cs.CV 2026-05-19 reviewed

Event streams improve VLM scene understanding in tough conditions
RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding

Hanqing Liu +5
cs.CR 2026-05-19 reviewed

Small models flag jailbreaks before large models answer
Exploring and Developing a Pre-Model Safeguard with Draft Models

Hongyu Cai +4