archive
Every paper Pith has read.
14513 papers in cs.AI · page 9
-
Inductive logic turns neural circuit findings into transferable theories
From Circuit Evidence to Mechanistic Theory: An Inductive Logic Approach
-
LLMs follow logical rules for conditionals but miss human implications
Tracing the ongoing emergence of human-like reasoning in Large Language Models
-
Semantic route cuts mental health prediction error across datasets
TimeSRL: Generalizable Time-Series Behavioral Modeling via Semantic RL-Tuned LLMs -- A Case Study in Mental Health
-
Large learning rates alter transformer attractors to cycles and chaos
Large-Step Training Dynamics of a Two-Factor Linear Transformer Model
-
One-step MeanFlow policies beat Gaussian baselines in RL
Stochastic MeanFlow Policies: One-Step Generative Control with Entropic Mirror Descent
-
One-step generative policies add multimodal actions to mirror descent RL
Stochastic MeanFlow Policies: One-Step Generative Control with Entropic Mirror Descent
-
105M open image-text pairs train competitive text-to-image model
MONET: A Massive, Open, Non-redundant and Enriched Text-to-image dataset
-
Moderate warm-up lets offline DPO surpass online RL on math reasoning
How Much Online RL is Enough? Informative Rollouts for Offline Preference Optimization in RLVR
-
Structural latent points raise robotic task success rates
Learning Structural Latent Points for Efficient Visual Representations in Robotic Manipulation
-
Strategy-map DAG keeps self-evolving agents from repeating old routines
APEX: Autonomous Policy Exploration for Self-Evolving LLM Agents
-
Region-aware VAE completes full heart motion cycle from single frame
RePCM: Region-Specific and Phenotype-Adaptive Bi-Ventricular Cardiac Motion Synthesis
-
Octahedral triplet quantizer trims KV cache bits
OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization
-
Preference tuning cuts RL policy failures by over 60%
PREFINE: Preference-Based Implicit Reward and Cost Fine-Tuning for Safety Alignment
-
AI Automates Every Stage of Microwave Photonics Systems
Artificial Intelligence Reshapes Microwave Photonics
-
QED cuts cross-run divergence in RL by two orders of magnitude
Behavior-Consistent Deep Reinforcement Learning
-
QED makes RL policies 100 times more consistent across runs
Behavior-Consistent Deep Reinforcement Learning
-
Quantum RL matches classical on chemical flowsheet design
Enhanced Reinforcement Learning-based Process Synthesis via Quantum Computing
-
New benchmark finds naive baselines hard to beat on social media sentiment
SURGE: An Event-Centric Social Media Sentiment Time Series Benchmark with Interaction Structure
-
Hardware load balancing keeps AI networks at 98% line rate
High-speed Networking for Giga-Scale AI Factories
-
SAM3 turns rough maps into sharp bacteria explanations
SAM-Sode: Towards Faithful Explanations for Tiny Bacteria Detection
-
Manga109 revised to correct 29,000 dialogue annotations
Manga109-v2026: Revisiting Manga109 Annotations for Modern Manga Understanding
-
Adaptive batch scaling unlocks large-batch RL
Scalable Reinforcement Learning via Adaptive Batch Scaling
-
Boundary-band generator lifts AV collision rates 6.2 points
ScenePilot: Controllable Boundary-Driven Critical Scenario Generation for Autonomous Driving
-
Weierstrass function supplies 2D patch encodings for vision transformers
Weierstrass Positional Encoding for Vision Transformers
-
YOLOv11 detects military targets in synthetic thermal and night drone images
Comparative Analysis of Military Detection Using Drone Imagery Across Multiple Visual Spectrums
-
Fine-tuned LLM reaches 0.866 F1 on Spanish psychiatric ICD coding
Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models
-
Spectral distances flag Trojaned DNN updates after one step
Detecting Trojaned DNNs via Spectral Regression Analysis
-
Complexity results proven for entailment in cumulative dependence logics
On the Complexity of Entailment for Cumulative Propositional Dependence Logics
-
Parallel Monte Carlo trains deep state space models 10x faster
Efficient Learning of Deep State Space Models via Importance Smoothing
-
Small classifier beats LLMs at pulling exact text from papers
ACL-Verbatim: hallucination-free question answering for research
-
Decoupled messages sustain MARL performance at low bandwidth
Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints
-
Distilling LLM agents yields RPA code that cuts token use 82-96%
AutoRPA: Efficient GUI Automation through LLM-Driven Code Synthesis from Interactions
-
New benchmark separates retrieval from generation errors in legal RAG
Fine-grained Claim-level RAG Benchmark for Law
-
ClaimRAG-LAW benchmark separates retrieval and generation errors in legal RAG
Fine-grained Claim-level RAG Benchmark for Law
-
New dataset separates retrieval from generation in legal RAG
Fine-grained Claim-level RAG Benchmark for Law
-
0.5B driving model matches 7B models by adding future visual states
Grounding Driving VLA via Inverse Kinematics
-
Vector quantization builds local calibration maps for multiclass models
Divide et Calibra: Multiclass Local Calibration via Vector Quantization
-
Local boundary finds valid adjustment sets for causal effects
Local Covariate Selection for Average Causal Effect Estimation without Pretreatment and Causal Sufficiency Assumptions
-
Dynamic sinks raise dynamic degree in long video generation
DySink: Dynamic Frame Sinks for Autoregressive Long Video Generation
-
Agent turns natural language into governed enterprise API calls
Beyond Text-to-SQL: An Agentic LLM System for Governed Enterprise Analytics APIs
-
Off-the-shelf persona vectors rival targeted sycophancy steering
Playing Devil's Advocate: Off-the-Shelf Persona Vectors Rival Targeted Steering for Sycophancy
-
DABS cuts multi-aspect sentiment computation by up to 60%
Single-Pass, Depth-Selective Reading for Multi-Aspect Sentiment Analysis
-
Landsat addition cuts TanDEM-X forest height RMSE by 13.5%
Hybrid Machine Learning Model for Forest Height Estimation from TanDEM-X and Landsat Data
-
Anchor regularization makes LLM safety consistent across prompt variations
Towards Context-Invariant Safety Alignment for Large Language Models
-
Flat minima enable non-vacuous bounds for transformers on sparse boolean tasks
A Sharper Picture of Generalization in Transformers
-
Routing imbalance in MoE stays fixed when expert parallelism scales
Diagnosing Overhead in Dispatch Operations: Cross-architecture Observatory
-
VGG16 detects fake images at 91% accuracy
Comparative Evaluation of Deep Learning Models for Fake Image Detection
-
Refusal rate misranks LLMs on bio safety
RefusalBench: Why Refusal Rate Misranks Frontier LLMs on Biological Research Prompts
4 Piths
-
Layer attention gaps reveal fix for LVLM hallucinations
Finding the Correct Visual Evidence Without Forgetting: Mitigating Hallucination in LVLMs via Inter-Layer Visual Attention Discrepancy
-
Focus-then-context method trims VLM tokens to 22% with tiny accuracy cost
Focus-then-Context: Subject-Centric Progressive Visual Token Reduction for Vision-Language Models