pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

14513 papers in cs.AI · page 15

  1. cs.RO 2026-05-19 reviewed
    Adaptive coaching speeds learning to use robot guide dogs

    CANINE: Coaching Visually Impaired Users for Interactive Navigation with a Robot Guide Dog

    Cunjun Yu +4

  2. cs.AI 2026-05-19 reviewed
    Attention-guided RL raises jailbreak success on reasoning models

    Attention-Guided Reward for Reinforcement Learning-based Jailbreak against Large Reasoning Models

    Zheng Lin +4

  3. cs.CV 2026-05-19 reviewed
    GUI agents reach only 36% success on media editing tasks

    CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing

    Haobo Hu +6

  4. cs.LG 2026-05-19 reviewed
    Finite dynamics samples enforce safety during reinforcement learning

    Sampling-Based Safe Reinforcement Learning

    Luca Vignola +6

  5. cs.LG 2026-05-19 reviewed
    Pre-training boosts time series detection by 375% but not forecasting

    Quantifying the Pre-training Dividend: Generative versus Latent Self-Supervised Learning for Time Series Foundation Models

    Noam Major +2

  6. cs.AI 2026-05-19 reviewed
    Group reward targets keep solution diversity alive in RL reasoning

    Beyond Mode Collapse: Distribution Matching for Diverse Reasoning

    Xiaozhe Li +12

  7. cs.AI 2026-05-19 reviewed
    GUIDE raises ad GMV 4.1% with built-in safety fallback

    Generative Auto-Bidding with Unified Modeling and Exploration

    Mingming Zhang +9

  8. cs.DC 2026-05-19 reviewed
    Predictor accuracy sets exact fault tolerance in Byzantine agreement

    Resilient Byzantine Agreement with Predictions

    Julien Dallot +4

  9. cs.AI 2026-05-19 reviewed
    Selective feedback reweighting lifts multi-turn agent success to 90%

    What and When to Distill: Selective Hindsight Distillation for Multi-Turn Agents

    Xiaozhe Li +8

  10. cs.CV 2026-05-19 reviewed
    Targeted attacks succeed on encoders without knowing the task

    Targeted Downstream-Agnostic Attack

    Zhuxin Lei +2

  11. cs.LG 2026-05-19 reviewed
    Spiking blocks replace Transformer nonlinearities with <1% accuracy drop

    Plug-and-Play Spiking Operators: Breaking the Nonlinearity Bottleneck in Spiking Transformers

    Xinzhe Yuan +6

  12. cs.LG 2026-05-19 reviewed
    Majority vote locks wrong answers after brief correct window in TTRL

    Detecting and Mitigating the Correct-Answer Extinction Window in Test-Time Reinforcement Learning with Majority Voting

    Hongxiang Lin +3

  13. cs.LG 2026-05-19 reviewed
    Model fuses layout and netlist to predict cell delay at 0.92% error

    FusionCell: Cross-Attentive Fusion of Layout Geometry and Netlist Topology for Standard-Cell Performance Prediction

    Haoyi Zhang +4

  14. cs.CV 2026-05-19 reviewed
    Prototype-anchored training halves calibration error in place recognition

    KappaPlace: Learning Hyperspherical Uncertainty for Visual Place Recognition via Prototype-Anchored Supervision

    Maya Yanko +1

  15. cs.CL 2026-05-19 reviewed
    Backtracking fixes dual biases in LLM reasoning distillation

    Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation

    Bing Wang +9

  16. cs.LG 2026-05-19 reviewed
    Output-layer gradient norm gates reuse to cut RLVR samples by 2.93x

    When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR

    Yuchun Miao +6

  17. eess.SP 2026-05-19 reviewed
    Pilot-only model beats full-CSI baselines across frequencies

    PilotWiMAE: Pilot-Native Representation Learning for Wireless Channels

    Berkay Guler +2

  18. cs.AI 2026-05-19 reviewed
    Signed graphs let AI agents resolve conflicts for better reasoning

    Conflict-Resilient Multi-Agent Reasoning via Signed Graph Modeling

    Longgang He +3

  19. cs.LG 2026-05-19 reviewed
    Feedback prefixing improves LLM scaling by up to 2.8x efficiency

    Introspective X Training: Feedback Conditioning Improves Scaling Across all LLM Training Stages

    Brandon Cui +9

  20. cs.LG 2026-05-19 reviewed
    ODE paths limit forgetting when merging models sequentially

    Unlocking the Potential of Continual Model Merging: An ODE Perspective

    Lihong Lin +1

  21. cs.LG 2026-05-19 reviewed
    ODE traces low-loss paths for sequential model merging

    Unlocking the Potential of Continual Model Merging: An ODE Perspective

    Lihong Lin +1

  22. cs.LG 2026-05-19 reviewed
    Large models improve with unfiltered low-quality data

    A Bitter Lesson for Data Filtering

    Christopher Mohri +2

  23. cs.CV 2026-05-19 reviewed
    JUDO outperforms GPT-4o on industrial anomaly QA with normal image references

    JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA

    Hyunju Kang +3

  24. cs.CV 2026-05-19 reviewed
    Rebalancing attention boosts motion in image-to-video models

    Rebalancing Reference Frame Dominance to Improve Motion in Image-to-Video Models

    Wooseok Jeon +5

  25. cs.CV 2026-05-19 reviewed
    Rebalancing attention reduces reference dominance and increases video motion

    Rebalancing Reference Frame Dominance to Improve Motion in Image-to-Video Models

    Wooseok Jeon +5

  26. cs.CV 2026-05-19 reviewed
    Unlearning methods leave class traces in model representations

    Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning

    Zhenyu Yu +4

  27. cs.CL 2026-05-19 reviewed
    Reassembling entity pairs boosts synthetic QA accuracy by 88.9%

    EmbGen: Teaching with Reassembled Corpora

    Arun K Lenin +3

  28. cs.AI 2026-05-19 reviewed
    LLMs run code for videos but miss spatial accuracy

    PRISM: A Benchmark for Programmatic Spatial-Temporal Reasoning

    Qiran Zhang +11

  29. cs.LG 2026-05-19 reviewed
    LLM safety benchmarks are orbits under group actions

    The Evaluation Game: Beyond Static LLM Benchmarking

    Paul Wang +3

  30. cs.AI 2026-05-19 reviewed
  31. cs.AI 2026-05-19 reviewed
    Probabilistic recursion lets models sample many reasoning paths

    Generative Recursive Reasoning

    Junyeob Baek +5

  32. cs.CV 2026-05-19 reviewed
    Concept ontology filters noisy negatives to lift chest X-ray zero-shot tasks

    Concept-Guided Noisy Negative Suppression for Zero-Shot Classification and Grounding of Chest X-Ray Findings

    Chenyu Lian +3

  33. cs.CV 2026-05-19 reviewed
    Heat dissipation flow matching outperforms most baselines

    Multi-Scale Generative Modeling with Heat Dissipation Flow Matching

    Jun Ma +4

  34. cs.HC 2026-05-19 reviewed
    Few agent skill specs fully disclose capabilities to users

    Toward User Comprehension Supports for LLM Agent Skill Specifications

    Zikai Alex Wen

  35. cs.HC 2026-05-19 reviewed
    Only 19% of cybersecurity skills include example cues for users

    Toward User Comprehension Supports for LLM Agent Skill Specifications

    Zikai Alex Wen

  36. cs.GR 2026-05-19 reviewed
    Repositioned anchors keep motion contacts across body shapes

    Skinned Motion Retargeting with Spatially Adaptive Interaction Guidance

    Soojin Choi +5

  37. q-bio.NC 2026-05-19 reviewed
    Action models align asymmetrically with brain action signals

    Brain alignment of reasoning and action representations from vision-language and action models during naturalistic gameplay

    Subba Reddy Oota +6

  38. cs.MA 2026-05-19 reviewed
    Architecture lets AI agents break rules legitimately when justified

    PAVE: A Cognitive Architecture for Legitimate Violation in Generative Agent Societies

    Ahmad Yehia +6

  39. cs.LG 2026-05-19 reviewed
    Claim differences as RL rewards balance caption hallucinations and omissions

    ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison

    Tianle Li +9

  40. cs.CL 2026-05-19 reviewed
    Supreme Court quashes 18 points more matrimonial petitions than Karnataka HC

    IMLJD: A Computational Dataset for Indian Matrimonial Litigation Analysis

    Joy Bose

  41. cs.CV 2026-05-19 reviewed
    Integral feedback reduces hallucinations in CT medical reports

    Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis

    Tianwei Lin +9

  42. cs.CL 2026-05-19 reviewed
    Benchmark labels hallucinations via explicit reference worlds

    HalluWorld: A Controlled Benchmark for Hallucination via Reference World Models

    Emmy Liu +6

    5 Piths
  43. cs.MA 2026-05-19 reviewed
    STAR-PólyaMath hits perfect scores on Putnam and IMO

    STAR-PólyaMath: Multi-Agent Reasoning under Persistent Meta-Strategic Supervision

    Jiaao Wu +5

  44. cs.AI 2026-05-19 reviewed
    Only 2 of 19 LLM trading studies use time-consistent data splits

    Agentic Trading: When LLM Agents Meet Financial Markets

    Yihan Xia +6

  45. q-bio.QM 2026-05-19 reviewed
    Protein Thoughts ranks true binders at mean position 11.2

    Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery

    Kingsley Yeon +2

  46. cs.GT 2026-05-19 reviewed
    LLMs close 99% of deals but earn low profits in hidden pricing

    PrefBench: Evaluating Zero-Shot LLM Agents in Hidden-Preference Personalized Pricing Negotiations

    Yingjie Lei

  47. cs.AI 2026-05-19 reviewed
    MOCHA improves agent skill correctness on every task

    MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

    Md Mehrab Tanjim +8

  48. cs.CV 2026-05-19 reviewed
    Event streams lift VLM captioning and VQA scores in low light and motion

    RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding

    Hanqing Liu +5

  49. cs.CV 2026-05-19 reviewed
    Event streams improve VLM scene understanding in tough conditions

    RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding

    Hanqing Liu +5

  50. cs.CR 2026-05-19 reviewed
    Small models flag jailbreaks before large models answer

    Exploring and Developing a Pre-Model Safeguard with Draft Models

    Hongyu Cai +4