pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

7661 papers in cs.CL · page 5

  1. cs.CL 2026-05-20 reviewed
    Post-editors change one in three literary MT metaphors

    Metaphors in Literary Post-Editing: Opening Pandora's Box?

    Aletta G. Dorst +2

  2. cs.LG 2026-05-20 reviewed
    ChunkFT fits full fine-tuning of 8B models in 14GB GPU memory

    ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning

    Yongkang Liu +9

  3. cs.CL 2026-05-20 reviewed
    Fine-tuned LLM reaches 0.866 F1 on Spanish psychiatric ICD coding

    Automated ICD Classification of Psychiatric Diagnoses: From Classical NLP to Large Language Models

    Fernando Ortega +5

  4. cs.LG 2026-05-20 reviewed
    SMoA outperforms LoRA in low-budget fine-tuning

    SMoA: Spectrum Modulation Adapter for Parameter-Efficient Fine-Tuning

    Yongkang Liu +9

  5. cs.CL 2026-05-20 reviewed
    Error highlights and suggestions bring no post-editing speed or quality gains

    Smarter edits? Post-editing with error highlights and translation suggestions

    Fleur V.J. van Tellingen +6

  6. cs.CL 2026-05-20 reviewed
    Small classifier beats LLMs at pulling exact text from papers

    ACL-Verbatim: hallucination-free question answering for research

    Gábor Recski +4

  7. cs.CL 2026-05-20 reviewed
    Extractors score 0.93 on articles but only 0.41-0.84 on other pages

    WCXB: A Multi-Type Web Content Extraction Benchmark

    Murrough Foley

  8. cs.CL 2026-05-20 reviewed
    LLMs unstable on Korean honorifics for car assistants

    LoCar: Localization-Aware Evaluation of In-Vehicle Assistants through Fine-Grained Sociolinguistic Control

    Seogyeong Jeong +6

  9. cs.CL 2026-05-20 reviewed
    LLMs reach 0.91 agreement grading public law exams

    GradeLegal: Automated Grading for German Legal Cases

    Abdullah Al Zubaer +4

  10. cs.CL 2026-05-20 reviewed
    ClaimRAG-LAW benchmark separates retrieval and generation errors in legal RAG

    Fine-grained Claim-level RAG Benchmark for Law

    Souvick Das +2

  11. cs.CL 2026-05-20 reviewed
    New benchmark separates retrieval from generation errors in legal RAG

    Fine-grained Claim-level RAG Benchmark for Law

    Souvick Das +2

  12. cs.CL 2026-05-20 reviewed
    New dataset separates retrieval from generation in legal RAG

    Fine-grained Claim-level RAG Benchmark for Law

    Souvick Das +2

  13. cs.CL 2026-05-20 reviewed
    Routing leads in adapting LLMs to hidden style preferences

    APM: Evaluating Style Personalization in LLMs with Arbitrary Preference Mappings

    Philipp Spohn +2

  14. cs.CL 2026-05-20 reviewed
    LLM-brain alignment stable across languages but not from surprise or compression

    Cross-lingual robustness of LLM-brain alignment and its computational roots

    Ni Yang +5

  15. cs.CL 2026-05-20 reviewed
    Less data yields clearer AI skills taxonomies

    Building a Custom Taxonomy of AI Skills and Tasks from the Ground Up with Job Postings

    Stephen Meisenbacher +1

  16. cs.CL 2026-05-20 reviewed
    Agent turns natural language into governed enterprise API calls

    Beyond Text-to-SQL: An Agentic LLM System for Governed Enterprise Analytics APIs

    Gundeep Singh +7

  17. cs.AI 2026-05-20 reviewed
    Off-the-shelf persona vectors rival targeted sycophancy steering

    Playing Devil's Advocate: Off-the-Shelf Persona Vectors Rival Targeted Steering for Sycophancy

    Ishaan Kelkar +5

  18. cs.CL 2026-05-20 reviewed
    DABS cuts multi-aspect sentiment computation by up to 60%

    Single-Pass, Depth-Selective Reading for Multi-Aspect Sentiment Analysis

    Yan Xia +3

  19. cs.CL 2026-05-20 reviewed
    Anchor regularization makes LLM safety consistent across prompt variations

    Towards Context-Invariant Safety Alignment for Large Language Models

    Yixu Wang +6

  20. cs.CL 2026-05-20 reviewed
    Arabic memes dataset finds Islamist and satirical ones most hostile

    ArPoMeme: An Annotated Arabic Multimodal Dataset for Political Ideology and Polarization

    Wajdi Zaghouani +3

  21. cs.CL 2026-05-20 reviewed
    Arabic job ads corpus shows gendered hiring patterns persist

    JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media

    Wajdi Zaghouani +3

  22. cs.CL 2026-05-20 reviewed
    Grafted hidden states raise language model scores over MoE and Engram

    Memory Grafting: Scaling Language Model Pre-training via Offline Conditional Memory

    Runxi Cheng +9

  23. cs.CL 2026-05-20 reviewed
    Interleaved reasoning boosts speech AI math scores by 13%

    Thinking-while-speaking: A Controlled, Interleaved Reasoning Method for Real-Time Speech Generation

    Xuan Du +6

  24. cs.LG 2026-05-20 reviewed
    DASH discovers strong hybrid attention for LLMs in 20 minutes on one GPU

    DASH: Fast Differentiable Architecture Search for Hybrid Attention in Minutes on a Single GPU

    Weizhe Chen +5

  25. cs.CL 2026-05-20 reviewed
    Strategy induction from questions alone improves LLM task instructions

    Strategy-Induct: Task-Level Strategy Induction for Instruction Generation

    Po-Chun Chen +2

  26. cs.CL 2026-05-20 reviewed
    Phoneme recognition proxies articulatory synthesis quality

    Evaluating Speech Articulation Synthesis with Articulatory Phoneme Recognition

    Vinicius Ribeiro +1

  27. cs.CL 2026-05-20 reviewed
    Task-routed experts lift implicit sentiment scores

    Task-Routed Mixture-of-Experts with Cognitive Appraisal for Implicit Sentiment Analysis

    Yaping Chai +2

  28. cs.CL 2026-05-20 reviewed
    Unlearned models keep low calibration but lean on shortcuts

    Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models

    Divyaksh Shukla +1

  29. cs.CL 2026-05-20 reviewed
    Corpora fine-tune machine translation for science

    Enhancing Scientific Discourse: Machine Translation for the Scientific Domain

    Dimitris Roussis +2

  30. cs.CL 2026-05-20 reviewed
    Skill synthesis scales terminal-agent data to beat baselines with 1% of it

    Terminal-World: Scaling Terminal-Agent Environments via Agent Skills

    Zihao Cheng +8

  31. cs.SI 2026-05-20 reviewed
    Multi-metric score spots synthetic narratives more reliably

    Detecting Synthetic Political Narratives in Cross-Platform Social Media Discourse

    Despoina Antonakaki +1

  32. cs.CL 2026-05-20 reviewed
    MemGym isolates memory from reasoning in agent benchmarks

    MemGym: a Long-Horizon Memory Environment for LLM Agents

    Wujiang Xu +10

  33. cs.CL 2026-05-20 reviewed
    7B open LLMs run GraphRAG locally for EHR schema queries

    GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

    Peter Fernandes +1

  34. cs.CL 2026-05-20 reviewed
    Column-sparse attention nearly doubles diffusion LLM speed

    PulseCol: Periodically Refreshed Column-Sparse Attention for Accelerating Diffusion Language Models

    Yanyi Lyu +5

  35. cs.CL 2026-05-20 reviewed
    Refined guidelines help LLMs match biomedical expert annotations

    Refining and Reusing Annotation Guidelines for LLM Annotation

    Kon Woo Kim +2

  36. cs.LG 2026-05-20 reviewed
    Only two of 20 Transformer modifications transfer at 1-3B

    Most Transformer Modifications Still Do Not Transfer at 1-3B: A 2020-2026 Update to Narang et al. (2021) with Downstream Evaluation and a Noise Floor

    Yang Zhao +4

  37. cs.CL 2026-05-20 reviewed
    Guidelines for text data raise consistency in climate impact datasets

    Assessing socio-economic climate impacts from text data

    Mariana Madruga de Brito +17

  38. cs.CL 2026-05-20 reviewed
    Social barriers outweigh linguistic ones in Arabic NLP

    Building Arabic NLP from the Ground Up: Twenty Years of Lessons, Failures, and Open Problems

    Wajdi Zaghouani

  39. cs.CL 2026-05-20 reviewed
    LLM interventions create user drift that biases simulated experiments

    The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study

    Victoria Lin +5

  40. cs.CL 2026-05-20 reviewed
    Detectors separate human and AI text well but lag on naming the model

    Findings of the Counter Turing Test: AI-Generated Text Detection

    Rajarshi Roy +18

  41. cs.LG 2026-05-20 reviewed
    Hidden states at paragraph boundaries tune verifier strictness

    The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

    Yefan Zhou +5

  42. cs.CV 2026-05-20 reviewed
    Constraint engine turns AI drawings into verifiable geometry reasoning

    Draw2Think: Harnessing Geometry Reasoning through Constraint Engine Interaction

    Juncheng Hu +3

  43. cs.LG 2026-05-20 reviewed
    RL scores full distributions to fix LLM regression

    Distribution-Aware Reward: Reinforcement Learning over Predictive Distributions for LLM Regression

    Jungsoo Park +6

  44. cs.CL 2026-05-20 reviewed
    Aligning task vectors to in-context next-token distributions lifts accuracy 9.2%

    Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning

    Jihoon Kwon +2

  45. cs.CL 2026-05-20 reviewed
    Framework synthesizes realistic conversational retrieval benchmarks

    MTR-Suite: A Framework for Evaluating and Synthesizing Conversational Retrieval Benchmarks

    Junhao Ruan +10

  46. cs.CL 2026-05-20 reviewed
    Categorical error rates beat WER for Indic speech recognition

    SCRIBE: Diagnostic Evaluation and Rich Transcription Models for Indic ASR

    Kavya Manohar +3

  47. cs.CL 2026-05-20 reviewed
    Agreement screening yields clearer text features at full accuracy

    Interpretable Discriminative Text Representations via Agreement and Label Disentanglement

    Tong Wang +2

  48. cs.CL 2026-05-20 reviewed
    Self-limiting losses compress embeddings without overfitting

    DIVE: Embedding Compression via Self-Limiting Gradient Updates

    Dongfang Zhao

  49. cs.CL 2026-05-20 reviewed
    Utility ranking trims credit document review to three minutes

    Beyond Semantic Similarity: A Two-Phase Non-Parametric Retrieval Workflow for Corporate Credit Underwriting

    Linus Ng Junjia +4

  50. cs.CL 2026-05-20 reviewed
    AI reviewer beats top human on Nature papers

    On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

    Seungone Kim +57