archive

Every paper Pith has read.

14513 papers in cs.AI · page 10

cs.LG 2026-05-20 reviewed

DASH discovers strong hybrid attention for LLMs in 20 minutes on one GPU
DASH: Fast Differentiable Architecture Search for Hybrid Attention in Minutes on a Single GPU

Weizhe Chen +5
cs.CL 2026-05-20 reviewed

Strategy induction from questions alone improves LLM task instructions
Strategy-Induct: Task-Level Strategy Induction for Instruction Generation

Po-Chun Chen +2
cs.LO 2026-05-20 reviewed

Vector-clock monitor matches causal-guard semantics locally
Causal Past Logic for Runtime Verification of Distributed LLM Agent Workflows

Benedikt Bollig
cs.LG 2026-05-20 reviewed

Oscillatory network scales to ImageNet with high efficiency
Winfree Oscillatory Neural Network

Jiawen Dai +1
cs.LG 2026-05-20 reviewed

One program decodes bundles at 100% on four frozen embeddings
Sutra: Tensor-Op RNNs as a Compilation Target for Vector Symbolic Architectures

Emma Leonhart
cs.LG 2026-05-20 reviewed

Sutra compiles VSA programs to tensor graphs with exact decoding
Sutra: Tensor-Op RNNs as a Compilation Target for Vector Symbolic Architectures

Emma Leonhart
cs.CL 2026-05-20 reviewed

Unlearned models stay well calibrated but lean on shortcuts
Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models

Divyaksh Shukla +1
cs.AI 2026-05-20 reviewed

Fighting game AIs learn how long to hold each move
For How Long Should We Be Punching? Learning Action Duration in Fighting Games

Hoang Hai Nguyen +2
cs.CV 2026-05-20 reviewed

VISTA wins Ego4D STA challenge by fusing frozen video features into detector
VISTA: Technical Report for the Ego4D Short-Term Object Interaction Anticipation at EgoVis 2026

Qiaohui Chu +6
cs.CR 2026-05-20 reviewed

Agent finds hidden threats in 15% of security incidents
GenAI-Driven Threat Detection with Microsoft Security Copilot

Scott Freitas +1
cs.CR 2026-05-20 reviewed

Agent surfaces novel threats in 15% of security incidents
GenAI-Driven Threat Detection with Microsoft Security Copilot

Scott Freitas +1
cs.CR 2026-05-20 reviewed

Frequency regularization lifts attack transfer to closed MLLMs
Frequency-Domain Regularized Adversarial Alignment for Transferable Attacks against Closed-Source MLLMs

Leitao Yuan +7
cs.CL 2026-05-20 reviewed

Skill synthesis scales terminal-agent data to beat baselines with 1% of it
Terminal-World: Scaling Terminal-Agent Environments via Agent Skills

Zihao Cheng +8
cs.AI 2026-05-20 reviewed

Five checkpoints enforce policy in generalist agents
Governance by Construction for Generalist Agents

Segev Shlomov +9
cs.AI 2026-05-20 reviewed

Taxonomy-based generator yields verifiable planning data for LLMs
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models

Ziliang Zhao +9
cs.LG 2026-05-20 reviewed

Gradient moment method cuts 3D Gaussian count by 85-97%
CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

SeungJeh Chung +3
cs.LG 2026-05-20 reviewed

Runtime bounds certify quantized KV attention with exact fallback
Runtime-Certified Bounded-Error Quantized Attention

Dean Calver
cs.LG 2026-05-20 reviewed

N-step correction tightens PPO bound for RL with verifiable rewards
Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards

Deokgyu Yoon +6
cs.SI 2026-05-20 reviewed

Multi-metric score spots synthetic narratives more reliably
Detecting Synthetic Political Narratives in Cross-Platform Social Media Discourse

Despoina Antonakaki +1
cs.RO 2026-05-20 reviewed

Hypernetwork generates full robot policies from instructions alone
DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

Hanxiang Ren +3
cs.CV 2026-05-20 reviewed

224K short videos collected by labels support semantic benchmarks
USV: Towards Understanding the User-generated Short-form Videos

Haoyue Cheng +5
cs.CV 2026-05-20 reviewed

New benchmark shows VLMs lag trained humans on building layouts
ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models

Qirui Shen +7
cs.AI 2026-05-20 reviewed

DPO matches RLHF only if optimal policy favors human responses
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment

Zhiqin Yang +5
cs.CL 2026-05-20 reviewed

7B open LLMs run GraphRAG locally for EHR schema queries
GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

Peter Fernandes +1
cs.LG 2026-05-20 reviewed

Preference vector tunes task balance in merged continual learning models
Tunable MAGMAX: Preference-Aware Model Merging for Continual Learning

Kei Hiroshima +2
cs.AR 2026-05-20 reviewed

ELSA gives spiking networks 3.4x faster inference than top accelerators
ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing

Kang You +8
cs.AI 2026-05-20 reviewed

Local writes accumulate into global solutions in recursive reasoners
Interaction Locality in Hierarchical Recursive Reasoning

Yosuke Miyanishi +1
cs.AI 2026-05-20 reviewed

New guidance resolves gradient conflicts in flow models
Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

Xuehui Yu +4
cs.LG 2026-05-20 reviewed

Bias correction cuts pretraining loss in AdamW and similar optimizers
Correcting Stochastic Update Bias in Preconditioned Language Model Optimizers

Nikhil Nayak +9
cs.LG 2026-05-20 reviewed

Distillation from richer pseudo-samples improves sparse glucose estimates
PACD-Net: Pseudo-Augmented Contrastive Distillation for Glycemic Control Estimation from SMBG

Canyu Lei +2
cs.LG 2026-05-20 reviewed

GLU shrinks NTK condition number for faster convergence
The Devil is in the Condition Numbers: Why is GLU Better than non-GLU Structure?

Xingyu Lyu +4
cs.LG 2026-05-20 reviewed

Hidden states at paragraph boundaries tune verifier strictness
The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

Yefan Zhou +5
cs.LG 2026-05-20 reviewed

Testbed embeds detectable hacks for automatic reward-gaming checks
Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale

Amit Roth +4
cs.AI 2026-05-20 reviewed

Text modeling of EV battery signals enables LLM fault diagnosis
VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

Joey Chan +2
cs.LG 2026-05-20 reviewed

RL scores full distributions to fix LLM regression
Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression

Jungsoo Park +6
cs.CR 2026-05-20 reviewed

Monitor reduces LLM agent covert channels to zero capacity
An Application-Layer Multi-Modal Covert-Channel Reference Monitor for LLM Agent Egress

Alfredo Metere
cs.CV 2026-05-20 reviewed

Designer ratings dataset lifts AI graphic scorer to 0.611 agreement
TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

Haonan Zhu +4
cs.CL 2026-05-20 reviewed

Aligning task vectors to in-context next-token distributions lifts accuracy 9.2%
Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning

Jihoon Kwon +2
cs.LG 2026-05-20 reviewed

Group statistics adapt clipping and temperature to lift LLM math scores
AGPO: Adaptive Group Policy Optimization with Dual Statistical Feedback

Miaobo Hu +7
cs.CV 2026-05-20 reviewed

SAVER selectively activates vision to boost F1 and cut latency in multimodal IE
SAVER: Selective As-Needed Vision Evidence for Multimodal Information Extraction

Miaobo Hu +7
cs.CL 2026-05-20 reviewed

Categorical error rates beat WER for Indic speech recognition
SCRIBE: Diagnostic Evaluation and Rich Transcription Models for Indic ASR

Kavya Manohar +3
cs.CV 2026-05-20 reviewed

DAR cuts DiT training iterations by 8.75x while improving FID by 2.11
Rethinking Cross-Layer Information Routing in Diffusion Transformers

Chao Xu +11
cs.DC 2026-05-20 reviewed

WebGPU backend cuts LLM memory use by 29-33% in browsers
Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU

Reese Levine +7
cs.CR 2026-05-20 reviewed

Heartbeat protocol revokes AI swarm credentials within fixed window
Heartbeat-Bound Hierarchical Credentials: Cryptographic Revocation for AI Agent Swarms

Saurabh Deochake
cs.AI 2026-05-20 reviewed

Agentic system solves 8 of 10 research math problems
RMA: an Agentic System for Research-Level Mathematical Problems

Zelin Zhao +3
cs.CL 2026-05-20 reviewed

Agreement screening yields clearer text features at full accuracy
Interpretable Discriminative Text Representations via Agreement and Label Disentanglement

Tong Wang +2
cs.AI 2026-05-20 reviewed

Typed contracts let agents compose data systems reliably
Declarative Data Services: Structured Agentic Discovery for Composing Data Systems

Shanshan Ye +1
cs.CL 2026-05-20 reviewed

Self-limiting losses compress embeddings without overfitting
DIVE: Embedding Compression via Self-Limiting Gradient Updates

Dongfang Zhao
cs.LG 2026-05-20 reviewed

Dynamic experts cut error on shifting time series
Dynamic TMoE: A Drift-Aware Dynamic Mixture of Experts Framework for Non-Stationary Time Series Forecasting

Jiawen Zhu +3
cs.CL 2026-05-20 reviewed

AI reviewer beats top human on Nature papers
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

Seungone Kim +57