archive

Every paper Pith has read.

14513 papers in cs.AI · page 10

  1. cs.LG 2026-05-20 reviewed
    DASH discovers strong hybrid attention for LLMs in 20 minutes on one GPU

    DASH: Fast Differentiable Architecture Search for Hybrid Attention in Minutes on a Single GPU

    Weizhe Chen +5

  2. cs.CL 2026-05-20 reviewed
    Strategy induction from questions alone improves LLM task instructions

    Strategy-Induct: Task-Level Strategy Induction for Instruction Generation

    Po-Chun Chen +2

  3. cs.LO 2026-05-20 reviewed
    Vector-clock monitor matches causal-guard semantics locally

    Causal Past Logic for Runtime Verification of Distributed LLM Agent Workflows

    Benedikt Bollig

  4. cs.LG 2026-05-20 reviewed
    Oscillatory network scales to ImageNet with high efficiency

    Winfree Oscillatory Neural Network

    Jiawen Dai +1

  5. cs.LG 2026-05-20 reviewed
    One program decodes bundles at 100% on four frozen embeddings

    Sutra: Tensor-Op RNNs as a Compilation Target for Vector Symbolic Architectures

    Emma Leonhart

  6. cs.LG 2026-05-20 reviewed
    Sutra compiles VSA programs to tensor graphs with exact decoding

    Sutra: Tensor-Op RNNs as a Compilation Target for Vector Symbolic Architectures

    Emma Leonhart

  7. cs.CL 2026-05-20 reviewed
    Unlearned models keep low calibration but lean on shortcuts

    Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models

    Divyaksh Shukla +1

  8. cs.AI 2026-05-20 reviewed
    Fighting game AIs learn how long to hold each move

    For How Long Should We Be Punching? Learning Action Duration in Fighting Games

    Hoang Hai Nguyen +2

  9. cs.CV 2026-05-20 reviewed
    VISTA wins Ego4D STA challenge by fusing frozen video features into detector

    VISTA: Technical Report for the Ego4D Short-Term Object Interaction Anticipation at EgoVis 2026

    Qiaohui Chu +6

  10. cs.CR 2026-05-20 reviewed
    Agent finds hidden threats in 15% of security incidents

    GenAI-Driven Threat Detection with Microsoft Security Copilot

    Scott Freitas +1

  11. cs.CR 2026-05-20 reviewed
    Agent surfaces novel threats in 15% of security incidents

    GenAI-Driven Threat Detection with Microsoft Security Copilot

    Scott Freitas +1

  12. cs.CR 2026-05-20 reviewed
    Frequency regularization lifts attack transfer to closed MLLMs

    Frequency-Domain Regularized Adversarial Alignment for Transferable Attacks against Closed-Source MLLMs

    Leitao Yuan +7

  13. cs.CL 2026-05-20 reviewed
    Skill synthesis scales terminal-agent data to beat baselines with 1% of it

    Terminal-World: Scaling Terminal-Agent Environments via Agent Skills

    Zihao Cheng +8

  14. cs.AI 2026-05-20 reviewed
    Five checkpoints enforce policy in generalist agents

    Governance by Construction for Generalist Agents

    Segev Shlomov +9

  15. cs.AI 2026-05-20 reviewed
    Taxonomy-based generator yields verifiable planning data for LLMs

    PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models

    Ziliang Zhao +9

  16. cs.LG 2026-05-20 reviewed
    Gradient moment method cuts 3D Gaussian count by 85-97%

    CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

    SeungJeh Chung +3

  17. cs.LG 2026-05-20 reviewed
    Runtime bounds certify quantized KV attention with exact fallback

    Runtime-Certified Bounded-Error Quantized Attention

    Dean Calver

  18. cs.LG 2026-05-20 reviewed
    N-step correction tightens PPO bound for RL with verifiable rewards

    Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards

    Deokgyu Yoon +6

  19. cs.SI 2026-05-20 reviewed
    Multi-metric score spots synthetic narratives more reliably

    Detecting Synthetic Political Narratives in Cross-Platform Social Media Discourse

    Despoina Antonakaki +1

  20. cs.RO 2026-05-20 reviewed
    Hypernetwork generates full robot policies from instructions alone

    DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

    Hanxiang Ren +3

  21. cs.CV 2026-05-20 reviewed
    224K short videos collected by labels support semantic benchmarks

    USV: Towards Understanding the User-generated Short-form Videos

    Haoyue Cheng +5

  22. cs.CV 2026-05-20 reviewed
    New benchmark shows VLMs lag trained humans on building layouts

    ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models

    Qirui Shen +7

  23. cs.AI 2026-05-20 reviewed
    DPO matches RLHF only if optimal policy favors human responses

    Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment

    Zhiqin Yang +5

  24. cs.CL 2026-05-20 reviewed
    7B open LLMs run GraphRAG locally for EHR schema queries

    GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

    Peter Fernandes +1

  25. cs.LG 2026-05-20 reviewed
    Preference vector tunes task balance in merged continual learning models

    Tunable MAGMAX: Preference-Aware Model Merging for Continual Learning

    Kei Hiroshima +2

  26. cs.AR 2026-05-20 reviewed
    ELSA gives spiking networks 3.4x faster inference than top accelerators

    ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing

    Kang You +8

  27. cs.AI 2026-05-20 reviewed
    Local writes accumulate into global solutions in recursive reasoners

    Interaction Locality in Hierarchical Recursive Reasoning

    Yosuke Miyanishi +1

  28. cs.AI 2026-05-20 reviewed
    New guidance resolves gradient conflicts in flow models

    Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

    Xuehui Yu +4

  29. cs.LG 2026-05-20 reviewed
    Bias correction cuts pretraining loss in AdamW and similar optimizers

    Correcting Stochastic Update Bias in Preconditioned Language Model Optimizers

    Nikhil Nayak +9

  30. cs.LG 2026-05-20 reviewed
    Distillation from richer pseudo-samples improves sparse glucose estimates

    PACD-Net: Pseudo-Augmented Contrastive Distillation for Glycemic Control Estimation from SMBG

    Canyu Lei +2

  31. cs.LG 2026-05-20 reviewed
    GLU shrinks NTK condition number for faster convergence

    The Devil is in the Condition Numbers: Why is GLU Better than non-GLU Structure?

    Xingyu Lyu +4

  32. cs.LG 2026-05-20 reviewed
    Hidden states at paragraph boundaries tune verifier strictness

    The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

    Yefan Zhou +5

  33. cs.LG 2026-05-20 reviewed
    Testbed embeds detectable hacks for automatic reward-gaming checks

    Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale

    Amit Roth +4

  34. cs.AI 2026-05-20 reviewed
    Text modeling of EV battery signals enables LLM fault diagnosis

    VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

    Joey Chan +2

  35. cs.LG 2026-05-20 reviewed
    RL scores full distributions to fix LLM regression

    Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression

    Jungsoo Park +6

  36. cs.CR 2026-05-20 reviewed
    Monitor reduces LLM agent covert channels to zero capacity

    An Application-Layer Multi-Modal Covert-Channel Reference Monitor for LLM Agent Egress

    Alfredo Metere

  37. cs.CV 2026-05-20 reviewed
    Designer ratings dataset lifts AI graphic scorer to 0.611 agreement

    TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

    Haonan Zhu +4

  38. cs.CL 2026-05-20 reviewed
    Aligning task vectors to in-context next-token distributions lifts accuracy 9.2%

    Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning

    Jihoon Kwon +2

  39. cs.LG 2026-05-20 reviewed
    Group statistics adapt clipping and temperature to lift LLM math scores

    AGPO: Adaptive Group Policy Optimization with Dual Statistical Feedback

    Miaobo Hu +7

  40. cs.CV 2026-05-20 reviewed
    SAVER selectively activates vision to boost F1 and cut latency in multimodal IE

    SAVER: Selective As-Needed Vision Evidence for Multimodal Information Extraction

    Miaobo Hu +7

  41. cs.CL 2026-05-20 reviewed
    Categorical error rates beat WER for Indic speech recognition

    SCRIBE: Diagnostic Evaluation and Rich Transcription Models for Indic ASR

    Kavya Manohar +3

  42. cs.CV 2026-05-20 reviewed
    DAR cuts DiT training iterations by 8.75x while improving FID by 2.11

    Rethinking Cross-Layer Information Routing in Diffusion Transformers

    Chao Xu +11

  43. cs.DC 2026-05-20 reviewed
    WebGPU backend cuts LLM memory use by 29-33% in browsers

    Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU

    Reese Levine +7

  44. cs.CR 2026-05-20 reviewed
    Heartbeat protocol revokes AI swarm credentials within fixed window

    Heartbeat-Bound Hierarchical Credentials: Cryptographic Revocation for AI Agent Swarms

    Saurabh Deochake

  45. cs.AI 2026-05-20 reviewed
    Agentic system solves 8 of 10 research math problems

    RMA: an Agentic System for Research-Level Mathematical Problems

    Zelin Zhao +3

  46. cs.CL 2026-05-20 reviewed
    Agreement screening yields clearer text features at full accuracy

    Interpretable Discriminative Text Representations via Agreement and Label Disentanglement

    Tong Wang +2

  47. cs.AI 2026-05-20 reviewed
    Typed contracts let agents compose data systems reliably

    Declarative Data Services: Structured Agentic Discovery for Composing Data Systems

    Shanshan Ye +1

  48. cs.CL 2026-05-20 reviewed
    Self-limiting losses compress embeddings without overfitting

    DIVE: Embedding Compression via Self-Limiting Gradient Updates

    Dongfang Zhao

  49. cs.LG 2026-05-20 reviewed
    Dynamic experts cut error on shifting time series

    Dynamic TMoE: A Drift-Aware Dynamic Mixture of Experts Framework for Non-Stationary Time Series Forecasting

    Jiawen Zhu +3

  50. cs.CL 2026-05-20 reviewed
    AI reviewer beats top human on Nature papers

    On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

    Seungone Kim +57