archive

Every paper Pith has read.

14513 papers in cs.AI · page 1

cs.AI 2026-05-22 reviewed

Optimizer model improves agent skills only via validation-raising text edits
SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Yifan Yang +14
cs.LG 2026-05-22 reviewed

Shannon capacity produces U-shaped LLM scaling curves
LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Xu Ouyang +7
cs.AI 2026-05-22 reviewed

Model-generated agent skills help on average but trigger negative transfer
From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

Zisu Huang +15
cs.AI 2026-05-22 reviewed

VLMs fail to ground numbers in spatial layouts
SPACENUM: Revisiting Spatial Numerical Understanding in VLMs

Jianshu Zhang +6
cs.CV 2026-05-22 reviewed

Dedicated image editor lifts multimodal reasoning by 5 points
ETCHR: Editing To Clarify and Harness Reasoning

Beichen Zhang +5
cs.CV 2026-05-22 reviewed

Token selection speeds geometry transformers over 85 percent
Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers

Shuhong Zheng +5
cs.DB 2026-05-22 reviewed

CHRONOS unifies index decay, pricing and privacy in data markets
CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolving Data Marketplaces

Joydeep Chandra
cs.CV 2026-05-22 reviewed

Geometric overlays on images lift MLLM spatial scores by 20%
PGT: Procedurally Generated Tasks for improving visual grounding in MLLMs

Rim Assouel +3
cs.HC 2026-05-22 reviewed

Persuasive LLM explanations do not raise decision accuracy
Human Decision-Making with Persuasive and Narrative LLM Explanations

Laura R. Marusich +3
cs.LG 2026-05-22 reviewed

Foundation models support zero-shot causal image reasoning
Leveraging Foundation Models for Causal Generative Modeling

Aneesh Komanduri +1
cs.LG 2026-05-22 reviewed

Post-training, not pre-training data, creates LLM geopolitical bias
It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

Stuart Bladon +1
cs.CV 2026-05-22 reviewed

Vision models match humans best at balanced generative-discriminative mix
Not Too Generative, Not Too Discriminative: The Human Alignment Sweet Spot

Jorge Chang Ortega +3
cs.AI 2026-05-22 reviewed

Adversarial alignment generalizes multimodal knowledge edits
Beyond Binary Edits: Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment

Haoyuan Wang +4
cs.AI 2026-05-22 reviewed

Claude agent verifies programs at 98 percent success rate
Agentic Proving for Program Verification

Alessandro Sosso +2
cs.CV 2026-05-22 reviewed

Agent beats baselines at text-guided 3D photo search
PhotoFlow: Agentic 3D Virtual Photography Missions

Jiarui Guo +7
cs.RO 2026-05-22 reviewed

Any2Any moves tracking models to new robots at 1% cost
Any2Any: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking

Ming Yang +3
cs.AI 2026-05-22 reviewed

MemAudit cuts memory poisoning success to zero after attacks
MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

Zhewen Tan +11
cs.CL 2026-05-22 reviewed

Recursive memory predicts next queries with 22x fewer tokens
OnePred: Next-Query Prediction via Recursive Intent Memory in Multi-Turn Conversations

Jiangwang Chen +6
cs.CV 2026-05-22 reviewed

Adaptive search fixes blind spots in high-res image perception for LLMs
CVSearch: Empowering Multimodal LLMs with Cognitive Visual Search for High-Resolution Image Perception

Liupeng Li +6
cs.AI 2026-05-22 reviewed

One shared RL policy controls thousands of NPCs with distinct personas
One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents

Yoosung Hong
cs.LG 2026-05-22 reviewed

Compatible output heads let students learn from noise
Learning Through Noise: Why Subliminal Learning Works and When It Fails

Vincent C. Brockers +4
cs.CR 2026-05-22 reviewed

Temporal gaps weaken Android malware model defenses
Adversarial Vulnerability Under Temporal Concept Drift: A Longitudinal Study of Android Malware Detection

Ahmed Sabbah +4
cs.CV 2026-05-22 reviewed

Entity patches in memory fix consistency in multi-shot videos
EM-Vid: Training-Free Entity-Centric Memory for Efficient and Consistent Multi-Shot Video Generation

Jente Vandersanden +4
cs.LG 2026-05-22 reviewed

Latent space lets diffusion language models sample faster with better quality
DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling

Jean-Marie Lemercier +5
cs.LG 2026-05-22 reviewed

Hysteretic attention reaches Turing completeness in constant depth
Preisach Attention: A Hysteretic Model of Sequential Memory

Piotr Frydrych
cs.LG 2026-05-22 reviewed

Meta-learning yields model performance scores on unlabeled data
Learning to Evaluate: Cost-Effective Model Evaluation on Unlabeled Data with Meta-Learning

Trinh Pham +4
cs.AI 2026-05-22 reviewed

Models schedule up to 1450 aircraft disassembly tasks
Solving the Aircraft Disassembly Scheduling Problem

Charles Thomas +1
cs.AI 2026-05-22 reviewed

Rubrics guide ReAct agents at each step for better search trajectories
Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents

Jiazheng Kang +6
cs.IR 2026-05-22 reviewed

Three-phase recipe keeps 98% precision in 190M retrieval models
HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval

Vipul Gupta +6
cs.AI 2026-05-22 reviewed

Hybrid DP-CP solves partial shop scheduling with flexible precedences
CP or DP? Why Not Both: A Case Study in the Partial Shop Scheduling Problem

Emma Legrand +2
cs.LG 2026-05-22 reviewed

Latent policy gradients forecast RL goal generalization
Understanding Goal Generalisation in Sequential Reinforcement Learning

Jason Ross Brown +1
cs.MA 2026-05-22 reviewed

ARMS learns shaping rewards in MARL without altering Nash equilibria
ARMS: Automatic Reward Shaping for Sparse-Reward Multi-Agent Reinforcement Learning

Elie Abboud +1
cs.CV 2026-05-22 reviewed

PathNavigate scans slides for surprises before matching the question
PathNavigate: A Training-Free Pathology Agent with Surprise-Guided Scan and Shared Slide Memory for Whole-Slide Image VQA

Chunze Yang +12
cs.LG 2026-05-22 reviewed

One network pass trains an agent on every goal at once
Goal-Conditioned Agents that Learn Everything All at Once

Michael Matthews +7
math.OC 2026-05-22 reviewed

Randomized screening yields directional stationarity in max-DC programs
RA-DCA: A Randomized Active-Set DCA for Directional Stationarity in Max-Structured DC Programs

Yi-Shuai Niu
cs.LG 2026-05-22 reviewed

New sampler cuts RL training time for flow models by up to 53%
Precise: SDE-Consistent Stochastic Sampling for RL Post-Training of Flow-Matching Models

Jade Zou +9
cs.GR 2026-05-22 reviewed

Sketches control long video generation via independent shots
DrawVideo: Generating Long Video from Storyboard Keyframe Sketches

Chuanzhi Xu +9
cs.LG 2026-05-22 reviewed

Velocity consistency shapes embeddings for top time series anomaly detection
VACE: Learning Geometrically Structured Representations for Time Series Anomaly Detection

Alberto D. Cencillo +3
cs.AI 2026-05-22 reviewed

Guided rollouts and masks fix distillation of target identities
EDGE-OPD: Internalizing Privileged Context with Evidence Guided On-Policy Distillation

Aristotelis Lazaridis +5
cs.LG 2026-05-22 reviewed

Self-generated tests and code co-evolve to match RLVR results
CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test

Zhangyi Hu +8
cs.CV 2026-05-22 reviewed

MDM distills vision-language datasets into compact synthetic sets
Multimodal Distribution Matching for Vision-Language Dataset Distillation

Jongoh Jeong +3
cs.CV 2026-05-22 reviewed

One model forecasts yields for many crops by learning their weather responses
PhenoYieldNet: Learning Crop-Aware Phenological Responses for Multi-Crop Yield Prediction

Yu Luo +6
cs.LG 2026-05-22 reviewed

DSEBO switches subspace dimension on convergence
Automated Random Embedding for Practical Bayesian Optimization with Unknown Effective Dimension

Hong Qian +7
cs.LG 2026-05-22 reviewed

CBANet raises minority recall in aggressive driving detection
CBANet: A Compact Attention-Based CNN-BiLSTM Network for Aggressive Driving Event Detection

Hanadi Alhamdan +3
cs.LG 2026-05-22 reviewed

Static contexts make individual dynamics identifiable from single snapshots
Learning Individual Dynamics from Sparse Cross-Sectional Snapshots

Christian Lagemann +3
cs.SE 2026-05-22 reviewed

Enterprise AI needs risk reduction testing
AI Assurance: A Comprehensive Testing Strategy for Enterprise AI Systems

Chitra Badagi +3
cs.CV 2026-05-22 reviewed

One-Forcing scores 83.76 on VBench for one-step video
One-Forcing: Towards Stable One-Step Autoregressive Video Generation

Jiaqi Feng +3
cs.CR 2026-05-22 reviewed

AI security papers favor attacks over defenses via uneven evaluations
AI Security Research Should Better Incentivize Defense Research

Youqian Zhang
cs.CL 2026-05-22 reviewed

SSDAU cuts ambiguity F1 drop in joint extraction from 32% to 8%
SSDAU: Structured Semantic Data Augmentation for Joint Entity and Relation Extraction

Jiawei He +7
cs.HC 2026-05-22 reviewed

Humans identify AI teammates at chance levels in group chats
Socially fluent AI decouples conversational signals from source identity in online interaction

Lixiang Yan +3