archive
Every paper Pith has read. Search by title, abstract, or pith.
14513 papers in cs.AI · page 3
-
BERT classifier labels 55k Ming-Qing letters from title lists
A Fine-Tuned BERT Classifier for Personal-Letter Titles in Late-Ming and Early-Qing Collected Works
-
Synthetic MRIs raise accuracy for one tumour classifier by 1.02%
Do Synthetic Brain MRIs Reliably Improve Tumour Classification? A StyleGAN2-ADA Class-Plane Augmentation Study on BRISC 2025
-
All seven LLMs generate vulnerable code in developer-like tests
Security of LLM-generated Code: A Comparative Analysis
-
Jacobian penalty on latent dynamics raises sample efficiency in DreamerV3
Dreaming Smoothly and Sample Efficiently with Gradient Penalized Latent Dynamics
-
KAN estimator converges independent of covariate dimension
KAPLAN: Kolmogorov-Arnold Prognostic Learnable Activation Networks for Survival Analysis
-
Marker calibration shortens reasoning paths
PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning
-
Dithering defends vision models against adversarial attacks
Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering
-
One config matches tuned AdamW across 1-8x horizons on LLMs
Anytime Training with Schedule-Free Spectral Optimization
-
Kubernetes agent framework shows retrieval yields only partial falsification
A measurement substrate for agentic Kubernetes operations: Methodology and a case study in retrieval-compounding falsification
-
DQN cuts latency for VR in 6G O-RAN slices
DRL-Driven Edge-Aware Utility Optimization for Multi-Slice 6G Networks
-
Recognition of evaluations depends on model-benchmark pairs
Decomposing and Measuring Evaluation Awareness
-
Compositionality rises then falls in LLM self-training
Model Collapse as Cultural Evolution
-
RAG method leads in mental health improvement detection
DreamerNLplus: Interpretable Modeling of Mental Health Dynamics from Social Media Timelines using Hybrid Rule-Based and RAG Methods
-
Motion data alone rivals video models trained on 10000x more examples
The TIME Machine: On The Power of Motion for Efficient Perception
-
LLMs learn what not to say via frequency competition
Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs
-
SAE features from LLMs map onto brain semantic regions
Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography
-
Intermediate layers hold more task info than final layers
Uncovering the Latent Potential of Deep Intermediate Representations
-
Training data language, not English, drives brain-LLM alignment
Brain-LLM Alignment Tracks Training Data, Not Typology
-
Transformers have fixed accuracy limits set by layers and width
The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems
-
LLM evolutionary optimizer boosts Bitcoin trading in backtests
MadEvolve: Evolutionary Optimization of Trading Systems with Large Language Models
-
AI for social good omits local context most for institutions
Whose Good, Whose Place? The Moral Geography of Agentic AI for Social Good
-
Proactive AI questions uncover 82% of autism language traits
A Proactive Multi-Agent Dialogue Framework for Assessing Social Language Disorder Traits in Autism
-
Robots detect underspecified features via demo variation and query for fixes
Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations
-
Test-time training raises jailbreak success rates to 95%
Test-Time Training Undermines Safety Guardrails
-
FIM pretraining yields linear verbatim memorization growth
Memorization Dynamics of Fill-in-the-Middle Pretraining
-
LLM code smells found in 73.5% of analyzed systems
LLM Code Smells: A Taxonomy and Detection Approach
-
Random feature selection outperforms many state-of-the-art methods
Worse than Random: The Importance of a Baseline for Unsupervised Feature Selection
-
Models balance rules and exceptions only under specific geometries
A mathematical theory of balancing relational generalization and memorization
-
Graph alignment detects LLM hallucinations better than GPT-4o
Graph Alignment Topology as an Inductive Bias for Grounding Detection
-
Entropy regularization needs non-degenerate information forces to work
Human-Centered Learning Mechanics: A Dynamical Framework for Entropy-Regulated Representation Learning
-
Vector rewards produce diverse LLM outputs that raise search scores
Vector Policy Optimization: Training for Diversity Improves Test-Time Search
-
Kernel density gradients yield conservative drifting at rate N^{-1/(d+4)}
Finite-Particle Convergence Rates for Conservative and Non-Conservative Drifting Models
-
Agents boost scores by rewriting their own code
MOSS: Self-Evolution through Source-Level Rewriting in Autonomous Agent Systems
-
Evidence verifier scores spans by accuracy gain in self-evolving agents
EVE-Agent: Evidence-Verifiable Self-Evolving Agents
-
Metro suicide risk scored from video by tracking and heatmaps
Suicide Risk Assessment from AI-powered Video Surveillance: An Interpretable Framework for Prevention in Metro Stations
-
Separate erase and write gates lift linear attention on long contexts
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention
-
KV cache guard cuts reconstruction leaks in multi-agent LLMs
LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems
-
DeltaBox cuts AI agent checkpoint and rollback to 14 ms and 5 ms
DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback
-
VLMs keep high scores after most image tokens are deleted
Seeing without Looking: Do Vision-Language Benchmarks Really Test Vision?
-
Transcoders trace VLM grounding and predict hallucinations at 0.68 AUC
Transcoders Trace Visual Grounding and Hallucinations in Vision-Language Models
-
Diffusion model generates continuous survival times from censored data
SDPM: Survival Diffusion Probabilistic Model for Continuous-Time Survival Analysis
-
Mamba model hits 76.8% accuracy on eye-gaze cognitive load
MambaGaze: Bidirectional Mamba with Explicit Missing Data Modeling for Cognitive Load Assessment from Eye-Gaze Tracking Data
-
ECG foundation models adapt to wearables for cognitive load
CogAdapt: Transferring Clinical ECG Foundation Models to Wearable Cognitive Load Assessment via Lead Adaptation
-
RL agent outperforms fixed rules for job shops with random arrivals
Deep Reinforcement Learning for Flexible Job Shop Scheduling with Random Job Arrivals
-
Consistency training cuts covert political bias in LLMs
Reducing Political Manipulation with Consistency Training
-
Time-ordered training keeps LLM facts fresher than shuffling
Understanding Data Temporality Impact on Large Language Models Pre-training
-
Mediative connective extends fuzzy logic soundly to quantum level
Mediative Fuzzy Logic: From Type-1 Foundations to Type-2, Type-3 and Quantum Extensions
-
AI agent solves 9 open Erdős problems via Lean proofs
Advancing Mathematics Research with AI-Driven Formal Proof Search
-
Trillion-minute pretraining improves wearable health predictions
Towards a General Intelligence and Interface for Wearable Health Data