archive

Every paper Pith has read. Search by title, abstract, or pith.

14513 papers in cs.AI · page 3

cs.CL 2026-05-21 reviewed

BERT classifier labels 55k Ming-Qing letters from title lists
A Fine-Tuned BERT Classifier for Personal-Letter Titles in Late-Ming and Early-Qing Collected Works

Queenie Luo
eess.IV 2026-05-21 reviewed

Synthetic MRIs raise accuracy for one tumour classifier by 1.02%
Do Synthetic Brain MRIs Reliably Improve Tumour Classification? A StyleGAN2-ADA Class-Plane Augmentation Study on BRISC 2025

José Rafael Noriega Cedeño
cs.SE 2026-05-21 reviewed

All seven LLMs generate vulnerable code in developer-like tests
Security of LLM-generated Code: A Comparative Analysis

Srivathsan G Morkonda +2
cs.LG 2026-05-21 reviewed

Jacobian penalty on latent dynamics raises sample efficiency in DreamerV3
Dreaming Smoothly and Sample Efficiently with Gradient Penalized Latent Dynamics

Romil V. Sonigra +1
stat.ML 2026-05-21 reviewed

KAN estimator converges independent of covariate dimension
KAPLAN: Kolmogorov-Arnold Prognostic Learnable Activation Networks for Survival Analysis

Stelios Boulitsakis Logothetis +2
cs.AI 2026-05-21 reviewed

Marker calibration shortens reasoning paths
PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning

Lingyu Jiang +8
cs.CV 2026-05-21 reviewed

Dithering defends vision models against adversarial attacks
Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

Yury Belousov +3
cs.LG 2026-05-21 reviewed

One config matches tuned AdamW across 1-8x horizons on LLMs
Anytime Training with Schedule-Free Spectral Optimization

Anuj Apte +4
cs.SE 2026-05-21 reviewed

Kubernetes agent framework shows retrieval yields only partial falsification
A measurement substrate for agentic Kubernetes operations: Methodology and a case study in retrieval-compounding falsification

Joshua Odmark +2
cs.NI 2026-05-21 reviewed

DQN cuts latency for VR in 6G O-RAN slices
DRL-Driven Edge-Aware Utility Optimization for Multi-Slice 6G Networks

Khaled M. Naguib +4
cs.LG 2026-05-21 reviewed

Recognition of evaluations depends on model-benchmark pairs
Decomposing and Measuring Evaluation Awareness

Changling Li +5
cs.CL 2026-05-21 reviewed

Compositionality rises then falls in LLM self-training
Model Collapse as Cultural Evolution

Dongxin Guo +2
cs.CL 2026-05-21 reviewed

RAG method leads in mental health improvement detection
DreamerNLplus: Interpretable Modeling of Mental Health Dynamics from Social Media Timelines using Hybrid Rule-Based and RAG Methods

Maryia Zhyrko +3
cs.CV 2026-05-21 reviewed

Motion data alone rivals video models trained on 10000x more examples
The TIME Machine: On The Power of Motion for Efficient Perception

Mantas Skackauskas +2
cs.CL 2026-05-21 reviewed

LLMs learn what not to say via frequency competition
Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

Dongxin Guo +2
cs.CL 2026-05-21 reviewed

SAE features from LLMs map onto brain semantic regions
Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography

Dongxin Guo +2
cs.LG 2026-05-21 reviewed

Intermediate layers hold more task info than final layers
Uncovering the Latent Potential of Deep Intermediate Representations

Arnesh Batra +4
cs.CL 2026-05-21 reviewed

Training data language, not English, drives brain-LLM alignment
Brain-LLM Alignment Tracks Training Data, Not Typology

Dongxin Guo +2
cs.AI 2026-05-21 reviewed

Transformers have fixed accuracy limits set by layers and width
The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems

Dongxin Guo
q-fin.TR 2026-05-21 reviewed

LLM evolutionary optimizer boosts Bitcoin trading in backtests
MadEvolve: Evolutionary Optimization of Trading Systems with Large Language Models

Yurii Kvasiuk +3
cs.CY 2026-05-21 reviewed

AI for social good omits local context most for institutions
Whose Good, Whose Place? The Moral Geography of Agentic AI for Social Good

Poli Nemkova +2
cs.CL 2026-05-21 reviewed

Proactive AI questions uncover 82% of autism language traits
A Proactive Multi-Agent Dialogue Framework for Assessing Social Language Disorder Traits in Autism

Chuanbo Hu +6
cs.RO 2026-05-21 reviewed

Robots detect underspecified features via demo variation and query for fixes
Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations

Helena Merker +2
cs.LG 2026-05-21 reviewed

Test-time training raises jailbreak success rates to 95%
Test-Time Training Undermines Safety Guardrails

Simone Antonelli +2
cs.CL 2026-05-21 reviewed

FIM pretraining yields linear verbatim memorization growth
Memorization Dynamics of Fill-in-the-Middle Pretraining

Tobias von Arx +1
cs.SE 2026-05-21 reviewed

LLM code smells found in 73.5% of analyzed systems
LLM Code Smells: A Taxonomy and Detection Approach

Zacharie Chenail-Larcher +4
cs.LG 2026-05-21 reviewed

Random Feature Selection Outperforms Many State-of-the-Art Methods
Worse than Random: The Importance of a Baseline for Unsupervised Feature Selection

Muhammad Rajabinasab +3
cs.LG 2026-05-21 reviewed

Models balance rules and exceptions only under specific geometries
A mathematical theory of balancing relational generalization and memorization

Luke Cheng +1
cs.CL 2026-05-21 reviewed

Graph alignment detects LLM hallucinations better than GPT-4o
Graph Alignment Topology as an Inductive Bias for Grounding Detection

Paul Landes +3
cs.LG 2026-05-21 reviewed

Entropy regularization needs non-degenerate information forces to work
Human-Centered Learning Mechanics: A Dynamical Framework for Entropy-Regulated Representation Learning

Kim Phuc Tran
cs.LG 2026-05-21 reviewed

Vector rewards produce diverse LLM outputs that raise search scores
Vector Policy Optimization: Training for Diversity Improves Test-Time Search

Ryan Bahlous-Boldi +8
cs.LG 2026-05-21 reviewed

The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning

Vishal Rajput
stat.ML 2026-05-21 reviewed

Kernel density gradients yield conservative drifting at rate N^{-1/(d+4)}
Finite-Particle Convergence Rates for Conservative and Non-Conservative Drifting Models

Krishnakumar Balasubramanian
cs.AI 2026-05-21 reviewed

Agents boost scores by rewriting their own code
MOSS: Self-Evolution through Source-Level Rewriting in Autonomous Agent Systems

Qianshu Cai +7
cs.AI 2026-05-21 reviewed

Evidence verifier scores spans by accuracy gain in self-evolving agents
EVE-Agent: Evidence-Verifiable Self-Evolving Agents

Yamato Arai +1
cs.CV 2026-05-21 reviewed

Metro suicide risk scored from video by tracking and heatmaps
Suicide Risk Assessment from AI-powered Video Surveillance: An Interpretable Framework for Prevention in Metro Stations

Safwen Naimi +3
cs.AI 2026-05-21 reviewed

Separate erase and write gates lift linear attention on long contexts
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Ali Hatamizadeh +2
cs.AI 2026-05-21 reviewed

KV cache guard cuts reconstruction leaks in multi-agent LLMs
LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

Sadia Asif +4
cs.OS 2026-05-21 reviewed

DeltaBox cuts AI agent checkpoint and rollback to 14 ms and 5 ms
DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback

Yunpeng Dong +9
cs.CV 2026-05-21 reviewed

VLMs keep high scores after most image tokens are deleted
Seeing without Looking: Do Vision-Language Benchmarks Really Test Vision?

Zixuan Lan +3
cs.LG 2026-05-21 reviewed

Transcoders trace VLM grounding and predict hallucinations at 0.68 AUC
Transcoders Trace Visual Grounding and Hallucinations in Vision-Language Models

Dimitrios Damianos +4
cs.LG 2026-05-21 reviewed

Diffusion model generates continuous survival times from censored data
SDPM: Survival Diffusion Probabilistic Model for Continuous-Time Survival Analysis

Stanislav R. Kirpichenko +2
cs.LG 2026-05-21 reviewed

Mamba model hits 76.8% accuracy on eye-gaze cognitive load
MambaGaze: Bidirectional Mamba with Explicit Missing Data Modeling for Cognitive Load Assessment from Eye-Gaze Tracking Data

Amir Mousavi +7
cs.LG 2026-05-21 reviewed

ECG foundation models adapt to wearables for cognitive load
CogAdapt: Transferring Clinical ECG Foundation Models to Wearable Cognitive Load Assessment via Lead Adaptation

Amir Mousavi +7
cs.AI 2026-05-21 reviewed

RL agent outperforms fixed rules for job shops with random arrivals
Deep Reinforcement Learning for Flexible Job Shop Scheduling with Random Job Arrivals

Yu Tang +4
cs.CL 2026-05-21 reviewed

Consistency training cuts covert political bias in LLMs
Reducing Political Manipulation with Consistency Training

Long Phan +5
cs.CL 2026-05-21 reviewed

Time-ordered training keeps LLM facts fresher than shuffling
Understanding Data Temporality Impact on Large Language Models Pre-training

Hippolyte Pilchen +4
cs.AI 2026-05-21 reviewed

Mediative connective extends fuzzy logic soundly to quantum level
Mediative Fuzzy Logic: From Type-1 Foundations to Type-2, Type-3 and Quantum Extensions

Oscar Montiel Ross
cs.AI 2026-05-21 reviewed

AI agent solves 9 open Erdős problems via Lean proofs
Advancing Mathematics Research with AI-Driven Formal Proof Search

George Tsoukalas +20
cs.AI 2026-05-21 reviewed

Trillion-minute pretraining improves wearable health predictions
Towards a General Intelligence and Interface for Wearable Health Data

Girish Narayanswamy +39