archive

Every paper Pith has read. Search by title, abstract, or pith.

14513 papers in cs.AI · page 11

cs.AI 2026-05-20 reviewed

Verifier rewards train neural translator to 86% LTL satisfiability
NeuroNL2LTL: A Neurosymbolic Framework for Natural Language Translation of Linear Temporal Logic

Paapa Kwesi Quansah +1
cs.LG 2026-05-20 reviewed

Reflector embeds reflection to block indirect jailbreaks
REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak

Jiachen Ma +5
cs.LG 2026-05-20 reviewed

Early entropy drop signals when CoT reasoning helps LLMs
When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

Wei Xia +3
eess.SP 2026-05-20 reviewed

Attention model doubles perfect multi-user Wi-Fi activity predictions
AMAR: Lightweight Attention-Based Multi-User Activity Recognition from Wi-Fi CSI

Amirhossein Mohammadi +1
cs.RO 2026-05-20 reviewed

Joint action-predicate model enables zero-shot robot skill composition
Jointly Learning Predicates and Actions Enables Zero-Shot Skill Composition

Benedict Quartey +5
cs.LG 2026-05-20 reviewed

RL method produces ready-to-bend pipes for aeroengines
Design for Manufacturing: A Manufacturability Knowledge-Integrated Reinforcement Learning Framework for Free-Form Pipe Routing in Aeroengines

Caicheng Wang +6
cs.LG 2026-05-20 reviewed

Self-distillation balances consensus across views to cut noise from privileged signals
AVSD: Adaptive-View Self-Distillation by Balancing Consensus and Teacher-Specific Privileged Signals

Duy Nguyen +9
cs.CR 2026-05-20 reviewed

LLM compilation creates hidden backdoor attack surface
Trusted Weights, Treacherous Optimizations? Optimization-Triggered Backdoor Attacks on LLMs

Yifei Wang +5
cs.CV 2026-05-20 reviewed

Training supervision lifts portrait alignment
Pareto-Enhanced Portrait Generation: Vision-Aligned Text Supervision for Alignment, Realism, and Aesthetics

Yunlong Wang +5
cs.AI 2026-05-20 reviewed

Temporal cache delivers 30.6x speedup on hits in agent pipelines
Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

Alimurtaza Mustafa Merchant +5
cs.CL 2026-05-20 reviewed

Pipeline triples accuracy for Indigenous image captions
Retrieval-Augmented Long-Context Translation for Cultural Image Captioning: Gators submission for AmericasNLP 2026 shared task

Aashish Dhawan +4
cs.CV 2026-05-20 reviewed

Autoregressive diffusion cuts video restoration latency to seconds
Accelerating Video Inverse Problem Solvers with Autoregressive Diffusion Models

Taesung Kwon +3
math.AP 2026-05-20 reviewed

AI generates proofs for lower bounds on advection-diffusion mixing
Lower Bounds for Advection-Diffusion Equations: An Exploration with AI-Generated Proofs

Chenyang An +1
cs.AI 2026-05-20 reviewed

Agents learn when to jump in vehicle routing search
COAgents: Multi-Agent Framework to Learn and Navigate Routing Problems Search Space

Oleksandr Yakovenko +6
cs.CV 2026-05-20 reviewed

Animate-inanimate split structures vision MoE experts stably
Beyond Routing: Characterising Expert Tuning and Representation in Vision Mixture-of-Experts

Gene Tangtartharakul +1
cs.AI 2026-05-20 reviewed

Multi-agent system cuts 5G repair time by 86 percent
From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA)

Binghan Wu +5
cs.CL 2026-05-20 reviewed

Self-training amplifies surface markers while deep syntax dies
Self-Training Doesn't Flatten Language -- It Restructures It: Surface Markers Amplify While Deep Syntax Dies

Ming Liu
cs.LG 2026-05-20 reviewed

Failure notes lift diagnostic AI accuracy up to 7%
MedExpMem: Adapting Experience Memory for Differential Diagnosis

Qianhan Feng +6
cs.LG 2026-05-20 reviewed

Unlearning by shifting erased points to retained semantic neighbors
Approximate Machine Unlearning through Manifold Representation Forgetting Guided by Self Mode Connectivity

Weiqi Wang +4
cs.AI 2026-05-20 reviewed

JAX simulator runs Mahjong at 2 million steps per second
Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX

Soichiro Nishimori +5
cs.LG 2026-05-20 reviewed

Small models copy last CoT number for 89-92% of arithmetic accuracy
The Readout Shortcut: Positional Number Copying Dominates Arithmetic CoT Readout in Small Language Models

Ming Liu
cs.MA 2026-05-19 reviewed

State management beats workspace isolation in multi-agent tasks
Multi-agent Collaboration with State Management

Mengyang Liu +4
cs.LG 2026-05-19 reviewed

Logit averaging in GRPO matches KL-regularized accuracy
Complementing reinforcement learning with SFT through logit averaging in the post training of LLMs

Xingwei Gan +1
cs.AI 2026-05-19 reviewed

AI agents enable precise tests of negotiator personality
Personality Engineering with AI Agents: A New Methodology for Negotiation Research

Michelle A. Vaccaro +1
cs.CV 2026-05-19 reviewed

Weighted clusters plus pruning give flexible speed-accuracy control in VPR
Faster or Stronger: Towards Flexible Visual Place Recognition via Weighted Aggregation and Token Pruning

Zichao Zeng +6
cs.LG 2026-05-19 reviewed

Learn image-space generators matching latent-process marginals
Latent Process Generator Matching

Lukas Billera +2
cs.LG 2026-05-19 reviewed

Geometric axioms explain neural network mechanisms
Axiomatizing Neural Networks via Pursuit of Subspaces

Mehmet Yamac +6
cs.AI 2026-05-19 reviewed

LLM agent accuracy drops to 0.54-0.62 without labels
AgentAtlas: Beyond Outcome Leaderboards for LLM Agents

Parsa Mazaheri +1
cs.CL 2026-05-19 reviewed

Co-occurrence patterns support subject-verb agreement learning
Collocational bootstrapping: A hypothesis about the learning of subject-verb agreement in humans and neural networks

Claire Hobbs +1
cs.CV 2026-05-19 reviewed

AI models lag behind text-only on 3D brain MRI benchmark
NeuroQA: A Large-Scale Image-Grounded Benchmark for 3D Brain MRI Understanding

Mohammad H. Abbasi +14

cs.LG 2026-05-19 reviewed

Compact neural net edges FIB-4 on advanced MASLD fibrosis detection
Machine-Learning-Enhanced Non-Invasive Testing for MASLD Fibrosis: Shallow-Deep Neural Networks Versus FIB-4, Tabular Foundation Models, and Large Language Models

Athanasios Angelakis +3
cs.AI 2026-05-19 reviewed

AI agent ships iOS app with one fix
Open-World Evaluations for Measuring Frontier AI Capabilities

Sayash Kapoor +17
cs.SD 2026-05-19 reviewed

Latent-space attacks survive audio codec compression
Codec-Robust Attacks on Audio LLMs

Jaechul Roh +3
cs.CV 2026-05-19 reviewed

Dataset pairs building models with shade maps for urban heat studies
ShadeBench: A Benchmark Dataset for Building Shade Simulation in Sustainable Society

Longchao Da +5
cs.LG 2026-05-19 reviewed

Min-gate fuses diffusion models to catch all four OOD shifts
Tippett-minimum Fusion of Representation-space Diffusion Models for Multi-Encoder Out-of-Distribution Detection

Neelkamal Bhuyan
cs.AI 2026-05-19 reviewed

New metrics score uncertainty-augmented systems as one proper rule
ECUAS$_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

Lautaro Estienne +4
cs.LG 2026-05-19 reviewed

Trained reflectors improve language agents on new tasks
Training Language Agents to Learn from Experience

Yuval Shalev +2
cs.SE 2026-05-19 reviewed

Code gen picks winner by clustering behaviors on auto-generated inputs
Code Generation by Differential Test Time Scaling

Yifeng He +4
cs.CV 2026-05-19 reviewed

Projection equivariance lifts CBCT-to-CT PSNR by 7 dB
EPC-3D-Diff: Equivariant Physics Consistent Conditional 3D Latent Diffusion for CBCT to CT Synthesis

Alzahra Altalib +5
cs.AI 2026-05-19 reviewed

Triplet loss creates high-quality embeddings for Horn logic
High Quality Embeddings for Horn Logic Reasoning

Yifan Zhang +6
cs.CV 2026-05-19 reviewed

Deep learning segments COVID lesions in CT with high accuracy
Pixel Wised Lesion Prediction on COVID-19 CT Imagery: A Comparative Analysis of Automated Image Segmentation Architectures

Sarmad Khan +3
cs.SE 2026-05-19 reviewed

Agentic AI coding improves with structured verification loops
Agentic Agile-V: From Vibe Coding to Verified Engineering in Software and Hardware Development

Christopher Koch
cs.LG 2026-05-19 reviewed

Linear probes on frozen LLMs forecast time series without supervision
LLM Pretraining Shapes a Generalizable Manifold: Insights into Cross-Modal Transfer to Time Series

Alexis Roger +6
cs.CV 2026-05-19 reviewed

ResNet and VGG hit 95-98 percent accuracy on COVID lung scans
A Comprehensive Comparison of Deep Learning Architectures for COVID-19 Classification on CT & X-ray Imagery

Sarmad Khan +3
cs.HC 2026-05-19 reviewed

AI agents form distinct emotional signatures on Moltbook
Modeling Emotional Dynamics in Agent-to-Agent Interactions on Moltbook

Syed Mhamudul Hasan +1
cs.LG 2026-05-19 reviewed

Weight decay separates memorization
Weight Decay Regimes in Grokking Transformers: Cheap Online Diagnostics

Lucky Verma
cs.LG 2026-05-19 reviewed

Tensor algebra recovers angular-momentum rules from molecules alone
Group-Algebraic Tensors: Provably-optimal Equivariant Learning and Physical Symmetry Discovery

Paulina Hoyos +7
cs.AI 2026-05-19 reviewed

Routing weights produce hierarchical attributions at zero cost
BOHM: Zero-Cost Hierarchical Attribution for Compound AI Systems

Joss Armstrong