pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

7661 papers in cs.CL · page 12

  1. cs.LG 2026-05-16 reviewed
    Hard examples stay unlearnable in RLVR despite correct rollouts

    The Unlearnability Phenomenon in RLVR for Language Models

    Yulin Chen +2

  2. cs.LG 2026-05-16 reviewed
    LLMs forget targeted knowledge via neutral remaps and closed-form edits

    ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

    Yujie Lin +4

  3. cs.LG 2026-05-16 reviewed
    Closed-form update unlearns sensitive LLM knowledge in few shots

    ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

    Yujie Lin +4

  4. cs.CL 2026-05-16 reviewed
    Lightweight LLMs match DNNs on court view generation

    Exploring Lightweight Large Language Models for Court View Generation

    Zhitian Hou +4

  5. cs.CL 2026-05-16 reviewed
    Retrieval assigns legal labels without hallucinations or retraining

    Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free

    Li Zhang +2

  6. cs.CL 2026-05-16 reviewed
    500-step pre-training on structured language matches LLM efficiency and adds human-like resistance

    Language Acquisition Device in Large Language Models

    Masato Mita +3

  7. cs.LG 2026-05-16 reviewed
    fMRI decodes continuous affect for individualized caption rewriting

    EmoMind: Decoding Affective Captions from Human Brain fMRI

    Bilal A. Mohammed +2

  8. cs.AI 2026-05-15 reviewed
    Bounding commitments cuts personalization failures to zero

    Recall Isn't Enough: Bounding Commitments in Personalized Language Systems

    Rui Tang +6

  9. cs.CL 2026-05-15 reviewed
    AI Agents Complete Only 28% of Healthcare Workflow Tasks

    CHI-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows?

    Haolin Chen +32

  10. cs.CL 2026-05-15 reviewed
    RoBERTa spots manner and result verbs at 89.6% accuracy

    A Scalable Tool for Measuring Manner and Result Verbs in Developmental Language Research

    Divyesh Pratap Singh +6

  11. cs.CL 2026-05-15 reviewed
    Graphs track dialogue state for better consistency checks

    SKG-Eval: Stateful Evaluation of Multi-Turn Dialogue via Incremental Semantic Knowledge Graphs

    Avijit Shil +1

  12. cs.CL 2026-05-15 reviewed
    Generative models output continuous emotion intensity scores

    Beyond Sentiment Classification: A Generative Framework for Emotion Intensity Evaluation in Text

    Francesco A. Fabozzi +2

  13. cs.LG 2026-05-15 reviewed
    Standard embeddings match MRL after truncation except at 80% cuts

    To MRL or not to MRL: Text Embeddings are Robust to Truncation Without Matryoshka Learning, Except In Heavy Truncation Scenarios

    Sotaro Takeshita +3

  14. cs.LG 2026-05-15 reviewed
    Alignment updates concentrate in transformer read pathways

    Where Pretraining writes and Alignment reads: the asymmetry of Transformer weight space

    Valeria Ruscio +2

  15. cs.CL 2026-05-15 reviewed
    HTML versions advance math paper accessibility with MathML 4

    Scaling Accessible Mathematics on arXiv: HTML Conversion and MathML 4

    Deyan Ginev +4

  16. cs.CL 2026-05-15 reviewed
    PQR finds 23-78% more QA agent failures with realistic queries

    PQR: A Framework to Generate Diverse and Realistic User Queries that Elicit QA Agent Failures

    Yunan Lu +4

  17. cs.LG 2026-05-15 reviewed
    Symphony outperforms on medical speech recognition

    Symphony for Speech-to-Text: Supporting Real-Time Medical Voice Interfaces

    Arne Nix +8

  18. cs.LG 2026-05-15 reviewed
    Medical speech system outperforms current leaders on clinical tasks

    Symphony for Speech-to-Text: Supporting Real-Time Medical Voice Interfaces

    Arne Nix +8

  19. cs.HC 2026-05-15 reviewed
    LLMs require critical parameter choices for qualitative work

    LLMs in Qualitative Research: Opportunities, Limitations, and Practical Considerations

    Henry Salgado +3

  20. cs.HC 2026-05-15 reviewed
    LLM outputs drift toward past context in extended chats

    Alignment Drift in Long-Term Human-LLM Interaction: A Mechanism-Oriented Framework

    Xintong Yao

  21. cs.CL 2026-05-15 reviewed
    Decay slope couples routing loss and execution rescue in LLM agent libraries

    The Scaling Laws of Skills in LLM Agent Systems

    Charles Chen +14

  22. cs.CL 2026-05-15 reviewed
    One framework turns utility numbers into readable bills with carbon totals

    A Generative AI Framework for Intelligent Utility Billing CO2 Analytics and Sustainable Resource Optimisation

    Pavan Manjunath +1

  23. cs.CY 2026-05-15 reviewed
    AI edits can steer collective opinions across networks

    AI-Mediated Communication Can Steer Collective Opinion

    Stratis Tsirtsis +4

  24. cs.LG 2026-05-15 reviewed
    Swap test choice changes safe layers for transformer pruning

    No Free Swap: Protocol-Dependent Layer Redundancy in Transformers

    Gabriel Garcia

  25. cs.AI 2026-05-15 reviewed
    Population broadcast lifts LLM agent returns up to 7.7x without weight updates

    FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast

    Igor Bogdanov +5

  26. cs.CL 2026-05-15 reviewed
    AI framework unifies gas distribution

    A Unified Generative-AI Framework for Smart Energy Infrastructure: Intelligent Gas Distribution, Utility Billing, Carbon Analytics, and Quantum-Inspired Optimisation

    Pavan Manjunath +1

  27. cs.CL 2026-05-15 reviewed
    Lesioned models produce aphasia symptoms unlike human cases

    Artificial Aphasias in Lesioned Language Models

    Nathan Roll +3

  28. cs.CL 2026-05-15 reviewed
    Evidence graph dispatches parallel searchers to reach 86.2 on BrowseComp

    Argus: Evidence Assembly for Scalable Deep Research Agents

    Zhen Zhang +9

  29. cs.CL 2026-05-15 reviewed
    Navigator assembles research from complementary evidence pieces

    Argus: Evidence Assembly for Scalable Deep Research Agents

    Zhen Zhang +9

  30. cs.AI 2026-05-15 reviewed
    Open pipeline lifts clinical LLMs to new benchmark highs

    Fully Open Meditron: An Auditable Pipeline for Clinical LLMs

    Xavier Theimer-Lienhard +7

  31. cs.AI 2026-05-15 reviewed
    LLM tutors spot optimal steps but over-accept wrong solutions

    Confirming Correct, Missing the Rest: LLM Tutoring Agents Struggle Where Feedback Matters Most

    Tahreem Yasir +5

  32. cs.AI 2026-05-15 reviewed
    State abstraction yields 76% higher returns per token for LLM agents

    Context, Reasoning, and Hierarchy: A Cost-Performance Study of Compound LLM Agent Design in an Adversarial POMDP

    Igor Bogdanov +5

  33. cs.CL 2026-05-15 reviewed
    Value profiles from surveys cut LLM cross-country errors

    Improving Cross-Cultural Survey Simulation with Calibrated Value Personas

    Axel Abels +3

  34. cs.CL 2026-05-15 reviewed
    AI tree search finds better 3D solar panel shapes

    Optimized Three-Dimensional Photovoltaic Structures with LLM guided Tree Search

    Michael P. Brenner +2

  35. cs.AI 2026-05-15 reviewed
    Exploration-first training improves LLM agents in new environments

    Look Before You Leap: Autonomous Exploration for LLM Agents

    Ziang Ye +8

  36. cs.CL 2026-05-15 reviewed
    External subgraphs guide LLMs to sharper multi-step answers

    SGR: A Stepwise Reasoning Framework for LLMs with External Subgraph Generation

    Xin Zhang +4

  37. cs.CL 2026-05-15 reviewed
    Retrieval reverses bias to make LLMs fairer without tuning

    DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation

    Rui Chu +8

  38. cs.CL 2026-05-15 reviewed
    Token relations fix bias in machine text detection

    Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection

    Chenwang Wu +4

  39. cs.DL 2026-05-15 reviewed
    Generative AI supports literature reviews through summarization and queries

    Generative Artificial Intelligence for Literature Reviews

    Gerit Wagner +4

  40. cs.CL 2026-05-15 reviewed
    LLM similarity selection lowers error for low cognitive scores

    Can Large Language Models Imitate Human Speech for Clinical Assessment? LLM-Driven Data Augmentation for Cognitive Score Prediction

    Si-Belkacem Yamine Ketir +5

  41. cs.AI 2026-05-15 reviewed
    Hybrid AI beats LLMs on unseen tax law cases

    Reasoners or Translators? Contamination-aware Evaluation and Neuro-Symbolic Robustness in Tax Law

    Parisa Kordjamshidi +4

  42. cs.CL 2026-05-15 reviewed
    RecMem cuts LLM agent memory costs by up to 87%

    RecMem: Recurrence-based Memory Consolidation for Efficient and Effective Long-Running LLM Agents

    Zijie Dai +6

  43. cs.CL 2026-05-15 reviewed
    Typological priors replace flat labels for better multilingual S2ST

    From Flat Language Labels to Typological Priors: Structured Language Conditioning for Multilingual Speech-to-Speech Translation

    Yu Pan +6

  44. cs.CL 2026-05-15 reviewed
  45. cs.CL 2026-05-15 reviewed
    VLMs vary in adapting math lessons to student profiles

    Can Vision Language Models Be Adaptive in Mathematics Education? A Learner Model-based Rubric Study

    Jie Gao +5

  46. cs.CL 2026-05-15 reviewed
    Taxonomy separates AI cultural knowledge from framing and adaptation

    Defining Cultural Capabilities for AI Evaluation: A Taxonomy Grounded in Intercultural Communication Theory

    Isar Nejadgholi +3

  47. cs.CL 2026-05-15 reviewed
    Symbolic ontology extracts facts from police report narratives

    Ontology for Policing: Conceptual Knowledge Learning for Semantic Understanding and Reasoning in Law Enforcement Reports

    Anita Srbinovska +3

  48. cs.CL 2026-05-15 reviewed
    RL fine-tunes MT models to +5 chrF++ without any parallel data

    Reference-Free Reinforcement Learning Fine-Tuning for MT: A Seq2Seq Perspective

    Ernesto Garcia-Estrada +2

  49. cs.HC 2026-05-15 reviewed
    Graduated signals let AI companions flag risks without false alarms on positive states

    SLIP & ETHICS: Graduated Intervention for AI Emotional Companions

    Minseo Kim

  50. cs.CL 2026-05-15 reviewed
    Block attention nears full performance via semantic blocks

    Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation

    Shuaiyi Li +7