archive
Every paper Pith has read.
14513 papers in cs.AI · page 16
-
Partial re-noising raises Sudoku accuracy from 56% to 75%
Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement
-
Evidence packets keep long-horizon robots aligned on task stages
ContextFlow: Hierarchical Task-State Alignment for Long-Horizon Embodied Agents
-
Gated CNN detects falls on smartwatches without attention
You Don't Need Attention: Gated Convolutional Modeling for Watch-Based Fall Detection
-
Diffusion model generates polycubes from point clouds for hex meshing
PolycubeNet: A Dual-latent Diffusion Model for Polycube-Based Hexahedral Mesh Generation
-
Tuning converts robot delays into VLA performance gains
DEFLECT: Delay-Robust Execution via Flow-matching Likelihood-Estimated Counterfactual Tuning for VLA Policies
-
Decoupled recursion cuts interference in MLLM edits
Modality-Decoupled Online Recursive Editing
-
Metric selects only necessary rationales for LLM misinformation checks
Are Rationales Necessary and Sufficient? Tuning LLMs for Explainable Misinformation Detection
-
Trajectory selection beats sampling in delayed disambiguation
EviTrack: Selection over Sampling for Delayed Disambiguation
-
End-to-end models output formal text straight from Chinese speech
FormalASR: End-to-End Spoken Chinese to Formal Text
-
Small abstract spaces enable RL generalization to larger tasks
Smaller Abstract State Spaces Enable Cross-Scale Generalization in Reinforcement Learning
-
Stake-weighted votes approximate power to ownership in expectation
Swimming with Whales: Analysis of Power Imbalances in Stake-Weighted Governance
-
Self-healing web apps detect faults at 90.7% and recover 56% faster
When Web Apps Heal Themselves: A MAPE-K Based Approach to Fault Tolerance and Adaptive Recovery
-
Quadtrees cut GUI agent visual tokens by 30 percent
AQuaUI: Visual Token Reduction for GUI Agents with Adaptive Quadtrees
-
Python framework unifies XAI methods for ECG models
ExECG: An Explainable AI Framework for ECG models
-
Imbalanced attention heads bias MLLMs toward text errors over visuals
Causal Evidence for Attention Head Imbalance in Modality Conflict Hallucination
-
Local distance graphs recover global Euclidean embeddings
Euclidean Embedding of Data Using Local Distances
-
Post-training lifts video models' physical consistency
PhyWorld: Physics-Faithful World Model for Video Generation
-
Language access managers accept AI but require human oversight
AI Technologies in Language Access: Attitudes Towards AI and the Human Value of Language Access Managers
-
Theory-anchored LLM cuts bias in disaster survey gaps
Can Large Language Models Revolutionize Survey Research? Experiments with Disaster Preparedness Responses
-
Step-level scores flag reasoning errors in closed LLMs
Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution
-
Single trigger corrupts both text and image outputs in unified models
Token by Token, Compromised: Backdoor Vulnerabilities in Unified Autoregressive Models
-
LLM uncertainty scores only measure output consistency
Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering
-
VLM agents match real A/B test shifts in 77 percent of cases
SimGym: A Framework for A/B Test Simulation in E-Commerce with Traffic-Grounded VLM Agents
-
PCA rotation aligns key channels for accurate VLM pruning
Rotation-Aligned Key Channel Pruning for Efficient Vision-Language Model Inference
-
Volatility increases exploration while stochasticity decreases it
Not all uncertainty is alike: volatility, stochasticity, and exploration
-
Quantized model cuts brain tumor AI size by 6x with same accuracy
Quantized Machine Learning Models for Medical Imaging in Low-Resource Healthcare Settings
-
RL quadrotor controller enables forest under-canopy inspections
Aerial Inspection Behaviors via RL-based Quadrotor Control for Under-canopy Forest Environments
-
PneumoNet hits 86.6% accuracy with 1.4% forgetting across device shifts
On-Device Continual Learning with Dual-Stage Buffer and Dynamic Loss for Point-of-Care Pneumonia Diagnosis
-
Evidence certificates stop agents from unsafe hallucinated actions
Hallucination as Exploit: Evidence-Carrying Multimodal Agents
-
Certified predicates stop hallucination exploits in agents
Hallucination as Exploit: Evidence-Carrying Multimodal Agents
-
Global South red teaming uncovers unique T2I harms
Going PLACES: Participatory Localized Red Teaming for Text-to-Image Safety in the Global South
-
Agents gain a profile to match KGs by what they can prove
Discoverable Agent Knowledge -- A Formal Framework for Agentic KG Affordances (Extended Version)
1 Pith -
Action-gap certificate certifies greedy goal reach in sparse planning
Planner-Admissible Graph-PDE Value Extensions for Sparse Goal-Conditioned Planning
-
Retrieval memory sharpens forecasts for new delivery zones
Bridge: Retrieval-Augmented Spatiotemporal Modeling for Urban Delivery Demand
-
AI agents produce 117 papers but none clear top-tier bar
How Far Are We From True Auto-Research?
-
Wrapper gives pathwise risk control for updating LLMs
Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs
4 Piths -
Trust calibration for AI agents as preference learning
Progressive Autonomy as Preference Learning: A Formalization of Trust Calibration for Agentic Tool Use
-
Sparse matrix bank gives SSMs dense-model expressivity
Flash PD-SSM: Memory-Optimized Structured Sparse State-Space Models
-
Low-rank bandits recover drifting subspaces from scalar rewards
Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity
4 Piths -
Benign rewriting lifts LLM safety against poisoning by 51 percent
Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks
-
Local attack and support calls stabilize global argument rankings
GRASP: Deterministic argument ranking in interaction graphs
-
Neural Q-learning converges with finite-sample bounds in decentralized handoffs
Learning to Hand Off: Provably Convergent Workflow Learning under Interface Constraints
4 Piths -
One model trained on text and time series matches both specialists
Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding
-
Smartphones collect 7500 robot demos in five days
COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones
-
Smartphone teleop rivals specialized hardware for robot demos
COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones
-
SSL pretraining helps models know when to skip DR predictions
Knowing When Not to Predict: Self Supervised Learning and Abstention for Safer DR Screening
-
VLMs need tight data alignment and miss weak signals in egocentric video
EgoBabyVLM: Benchmarking Cross-Modal Learning from Naturalistic Egocentric Video Data
-
Frontier LLMs withhold 99% of protected data in agent tasks
POLAR-Bench: A Diagnostic Benchmark for Privacy-Utility Trade-offs in LLM Agents
-
Graph diffusion solver reaches 100% feasibility on multi-objective scheduling
GOAL: Graph-based Objective-Aligned Diffusion Solvers for Dynamic Multi-Objective Optimization
-
Diffusion model turns uniform organ maps into realistic PET scans
Generation of Heterogeneous PET Images from Uniform Organ Activity Maps Using a Pretrained Domain-Adapted Diffusion Model