archive

Every paper Pith has read.

14513 papers in cs.AI · page 16

cs.LG 2026-05-19 reviewed

Partial re-noising raises Sudoku accuracy from 56% to 75%
Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement

Taegu Kang +2
cs.RO 2026-05-19 reviewed

Evidence packets keep long-horizon robots aligned on task stages
ContextFlow: Hierarchical Task-State Alignment for Long-Horizon Embodied Agents

Shuhan Guo +6
cs.CV 2026-05-19 reviewed

Gated CNN detects falls on smartwatches without attention
You Don't Need Attention: Gated Convolutional Modeling for Watch-Based Fall Detection

Sana Alamgeer +4
cs.GR 2026-05-19 reviewed

Diffusion model generates polycubes from point clouds for hex meshing
PolycubeNet: A Dual-latent Diffusion Model for Polycube-Based Hexahedral Mesh Generation

Lu He +7
cs.RO 2026-05-19 reviewed

Tuning converts robot delays into VLA performance gains
DEFLECT: Delay-Robust Execution via Flow-matching Likelihood-Estimated Counterfactual Tuning for VLA Policies

Yixiang Zhu +7
cs.LG 2026-05-19 reviewed

Decoupled recursion cuts interference in MLLM edits
Modality-Decoupled Online Recursive Editing

Siyuan Li +3
cs.CL 2026-05-19 reviewed

Metric selects only necessary rationales for LLM misinformation checks
Are Rationales Necessary and Sufficient? Tuning LLMs for Explainable Misinformation Detection

Bing Wang +8
cs.LG 2026-05-19 reviewed

Trajectory selection beats sampling in delayed disambiguation
EviTrack: Selection over Sampling for Delayed Disambiguation

Omer Haq
cs.CL 2026-05-19 reviewed

End-to-end models output formal text straight from Chinese speech
FormalASR: End-to-End Spoken Chinese to Formal Text

Wanyi Ning +5
cs.LG 2026-05-19 reviewed

Small abstract spaces enable RL generalization to larger tasks
Smaller Abstract State Spaces Enable Cross-Scale Generalization in Reinforcement Learning

Nasehatul Mustakim +1
cs.AI 2026-05-19 reviewed

Stake-weighted votes approximate power to ownership in expectation
Swimming with Whales: Analysis of Power Imbalances in Stake-Weighted Governance

Yuzhe Zhang +3
cs.SE 2026-05-19 reviewed

Self-healing web apps detect faults at 90.7% and recover 56% faster
When Web Apps Heal Themselves: A MAPE-K Based Approach to Fault Tolerance and Adaptive Recovery

Sales Aribe Jr +1
cs.AI 2026-05-19 reviewed

Quadtrees cut GUI agent visual tokens by 30 percent
AQuaUI: Visual Token Reduction for GUI Agents with Adaptive Quadtrees

Yuankai Li +5
cs.LG 2026-05-19 reviewed

Python framework unifies XAI methods for ECG models
ExECG: An Explainable AI Framework for ECG models

Jong-Hwan Jang +1
cs.AI 2026-05-19 reviewed

Imbalanced attention heads bias MLLMs toward text errors over visuals
Causal Evidence for Attention Head Imbalance in Modality Conflict Hallucination

Jinrui Jiang +3
cs.LG 2026-05-19 reviewed

Local distance graphs recover global Euclidean embeddings
Euclidean Embedding of Data Using Local Distances

Dimitris Arabadjis
cs.CV 2026-05-19 reviewed

Post-training lifts video models' physical consistency
PhyWorld: Physics-Faithful World Model for Video Generation

Pu Zhao +12
cs.CL 2026-05-19 reviewed

Language access managers accept AI but require human oversight
AI Technologies in Language Access: Attitudes Towards AI and the Human Value of Language Access Managers

Miguel A. Jiménez-Crespo +2
cs.AI 2026-05-19 reviewed

Theory-anchored LLM cuts bias in disaster survey gaps
Can Large Language Models Revolutionize Survey Research? Experiments with Disaster Preparedness Responses

Yan Wang +2
cs.CL 2026-05-19 reviewed

Step-level scores flag reasoning errors in closed LLMs
Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution

Xiaoou Liu +5
cs.CR 2026-05-19 reviewed

Single trigger corrupts both text and image outputs in unified models
Token by Token, Compromised: Backdoor Vulnerabilities in Unified Autoregressive Models

Tobias Braun +4
cs.CL 2026-05-19 reviewed

LLM uncertainty scores only measure output consistency
Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering

Tiejin Chen +3
cs.AI 2026-05-19 reviewed

VLM agents match real A/B test shifts in 77 percent of cases
SimGym: A Framework for A/B Test Simulation in E-Commerce with Traffic-Grounded VLM Agents

Han Li +19
cs.CV 2026-05-19 reviewed

PCA rotation aligns key channels for accurate VLM pruning
Rotation-Aligned Key Channel Pruning for Efficient Vision-Language Model Inference

Beomseok Kang +4
cs.AI 2026-05-19 reviewed

Volatility increases exploration while stochasticity decreases it
Not all uncertainty is alike: volatility, stochasticity, and exploration

Payam Piray
cs.CV 2026-05-19 reviewed

Quantized model cuts brain tumor AI size by 6x with same accuracy
Quantized Machine Learning Models for Medical Imaging in Low-Resource Healthcare Settings

Sumanth Meenan Kanneti +1
cs.RO 2026-05-19 reviewed

RL quadrotor controller enables forest under-canopy inspections
Aerial Inspection Behaviors via RL-based Quadrotor Control for Under-canopy Forest Environments

Fausto Mauricio Lagos Suarez +4
cs.LG 2026-05-19 reviewed

PneumoNet hits 86.6% accuracy with 1.4% forgetting across device shifts
On-Device Continual Learning with Dual-Stage Buffer and Dynamic Loss for Point-of-Care Pneumonia Diagnosis

Danu Kim
cs.AI 2026-05-18 reviewed

Evidence certificates stop agents from unsafe hallucinated actions
Hallucination as Exploit: Evidence-Carrying Multimodal Agents

Guijia Zhang +2
cs.AI 2026-05-18 reviewed

Certified predicates stop hallucination exploits in agents
Hallucination as Exploit: Evidence-Carrying Multimodal Agents

Guijia Zhang +2
cs.CY 2026-05-18 reviewed

Global South red teaming uncovers unique T2I harms
Going PLACES: Participatory Localized Red Teaming for Text-to-Image Safety in the Global South

Charvi Rastogi +15
cs.AI 2026-05-18 reviewed

Agents gain a profile to match KGs by what they can prove
Discoverable Agent Knowledge -- A Formal Framework for Agentic KG Affordances (Extended Version)

Terry R. Payne +2

1 Pith
cs.LG 2026-05-18 reviewed

Action-gap certificate certifies greedy goal reach in sparse planning
Planner-Admissible Graph-PDE Value Extensions for Sparse Goal-Conditioned Planning

Shiheng Zhang
cs.LG 2026-05-18 reviewed

Retrieval memory sharpens forecasts for new delivery zones
Bridge: Retrieval-Augmented Spatiotemporal Modeling for Urban Delivery Demand

Yihong Tang +5
cs.AI 2026-05-18 reviewed

AI agents produce 117 papers but none clear top-tier bar
How Far Are We From True Auto-Research?

Zhengxin Zhang +3
cs.LG 2026-05-18 reviewed

Wrapper gives pathwise risk control for updating LLMs
Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs

Hamed Khosravi +1

4 Piths
cs.AI 2026-05-18 reviewed

Trust calibration for AI agents as preference learning
Progressive Autonomy as Preference Learning: A Formalization of Trust Calibration for Agentic Tool Use

Changkun Ou
cs.LG 2026-05-18 reviewed

Sparse matrix bank gives SSMs dense-model expressivity
Flash PD-SSM: Memory-Optimized Structured Sparse State-Space Models

Aleksandar Terzić +6
cs.LG 2026-05-18 reviewed

Low-rank bandits recover drifting subspaces from scalar rewards
Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity

Hamed Khosravi +1

4 Piths
cs.CR 2026-05-18 reviewed

Benign rewriting lifts LLM safety against poisoning by 51 percent
Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks

John T. Halloran +1
cs.LG 2026-05-18 reviewed

Local attack and support calls stabilize global argument rankings
GRASP: Deterministic argument ranking in interaction graphs

Diganta Misra +3
cs.AI 2026-05-18 reviewed

Neural Q-learning converges with finite-sample bounds in decentralized handoffs
Learning to Hand Off: Provably Convergent Workflow Learning under Interface Constraints

Jiayu Li +4

4 Piths
cs.LG 2026-05-18 reviewed

One model trained on text and time series matches both specialists
Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding

Paul Quinlan +3
cs.RO 2026-05-18 reviewed

Smartphones collect 7500 robot demos in five days
COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones

Ayush Agarwal +8
cs.RO 2026-05-18 reviewed

Smartphone teleop rivals specialized hardware for robot demos
COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones

Ayush Agarwal +8
cs.CV 2026-05-18 reviewed

SSL pretraining helps models know when to skip DR predictions
Knowing When Not to Predict: Self Supervised Learning and Abstention for Safer DR Screening

Muskaan Chopra +3
cs.LG 2026-05-18 reviewed

VLMs need tight data alignment and miss weak signals in egocentric video
EgoBabyVLM: Benchmarking Cross-Modal Learning from Naturalistic Egocentric Video Data

Dongyan Lin +21
cs.AI 2026-05-18 reviewed

Frontier LLMs withhold 99% of protected data in agent tasks
POLAR-Bench: A Diagnostic Benchmark for Privacy-Utility Trade-offs in LLM Agents

Qiaoyuan Zheng +3
cs.NE 2026-05-18 reviewed

Graph diffusion solver reaches 100% feasibility on multi-objective scheduling
GOAL: Graph-based Objective-Aligned Diffusion Solvers for Dynamic Multi-Objective Optimization

Xingyu Li
cs.CV 2026-05-18 reviewed

Diffusion model turns uniform organ maps into realistic PET scans
Generation of Heterogeneous PET Images from Uniform Organ Activity Maps Using a Pretrained Domain-Adapted Diffusion Model

Suya Li +4