archive

Every paper Pith has read. Search by title, abstract, or pith.

7661 papers in cs.CL · page 10

  1. cs.AI 2026-05-18 reviewed
    Entropy-gradient inversion marks stronger reasoning in LRMs

    Entropy-Gradient Inversion: Moving Toward Internal Mechanism of Large Reasoning Models

    Junyao Yang +6

  2. cs.AI 2026-05-18 reviewed
    Entropy-gradient inversion marks stronger reasoning in large models

    Entropy-Gradient Inversion: Moving Toward Internal Mechanism of Large Reasoning Models

    Junyao Yang +6

  3. cs.CL 2026-05-18 reviewed
    Mixing ICD-9 and ICD-10 data lifts rare code F1 by 27 percent

    Bridging the Version Gap: Multi-version Training Improves ICD Code Prediction, Especially for Rare Codes

    Jinghui Liu +1

  4. cs.CL 2026-05-18 reviewed
    Topic models assign themes to segments

    From Documents to Segments: A Contextual Reformulation for Topic Assignment

    Hoonsang Yoon +5

  5. cs.CL 2026-05-18 reviewed
    Distillation cuts error rates for Nigerian speech recognition by 29%

    Sometin Beta Pass Notin (SBPN): Improving Multilingual ASR for Nigerian Languages via Knowledge Distillation

    Sewade Ogun

  6. cs.LG 2026-05-18 reviewed
    Persistent margins, not drifts, carry safety signals across LLM layers

    Geometry-Lite: Interpretable Safety Probing via Layer-Wise Margin Geometry

    Woo Seob Sim +1

  7. cs.CL 2026-05-17 reviewed
    LLMs mirror human power biases in simulated talks

    Do LLM Agents Mirror Socio-Cognitive Effects in Power-Asymmetric Conversations?

    Anvesh Rao Vijjini +2

  8. cs.CL 2026-05-17 reviewed
    LLMs copy human power dynamics in role-play dialogues

    Do LLM Agents Mirror Socio-Cognitive Effects in Power-Asymmetric Conversations?

    Anvesh Rao Vijjini +2

  9. cs.CL 2026-05-17 reviewed
    Gemini leads LLM benchmark on legal precedent classification

    Validate Your Authority: Benchmarking LLMs on Multi-Label Precedent Treatment Classification

    M. Mikail Demir +1

  10. cs.CL 2026-05-17 reviewed
    Reasoning models cut 26% tokens by exiting at semantic convergence

    Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

    Dehai Min +5

  11. cs.CL 2026-05-17 reviewed
    Peer editing with audio matches speech summary quality to transcripts

    Beyond Transcripts: Iterative Peer-Editing with Audio Unlocks High-Quality Human Summaries of Conversational Speech

    Kaavya Chaparala +5

  12. cs.AI 2026-05-17 reviewed
    Causal tests select better memories for long-running AI agents

    Causal Intervention-Based Memory Selection for Long-Horizon LLM Agents

    Saksham Sahai Srivastava

  13. cs.CL 2026-05-17 reviewed
    Co-citation predictability drops over 20 years

    Temporal Decay of Co-Citation Predictability: A 20-Year Statute Retrieval Benchmark from 396M Ukrainian Court Citations

    Volodymyr Ovcharov

  14. cs.CR 2026-05-17 reviewed
    Adversary can always reframe prompt injections as legitimate

    AI Agents May Always Fall for Prompt Injections

    Sahar Abdelnabi +1

  15. cs.CV 2026-05-17 reviewed
    Fast-slow video guardrail tops larger models at lower cost

    SafeLens: Deliberate and Efficient Video Guardrails with Fast-and-Slow Screening

    Shahriar Kabir Nahin +3

  16. cs.CL 2026-05-17 reviewed
    MoE models show deep-layer routing collapse for low-resource languages

    Mixture of Experts for Low-Resource LLMs

    Ori Bar Joseph +4

  17. cs.LG 2026-05-17 reviewed
    Mu-GRPO halves LLM RL wall-clock time with stale rollouts

    How Off-Policy Can GRPO Be? Mu-GRPO for Efficient LLM Reinforcement Learning

    Minghao Tian +2

  18. cs.AI 2026-05-17 reviewed
    Small chess model tops larger ones via pattern matching

    Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models

    Ethan Tang

  19. cs.SE 2026-05-17 reviewed
    Inverted API exploration yields verified tool-call data

    Firefly: Illuminating Large-Scale Verified Tool-Call Data Generation from Real APIs

    Yuxuan Lu +14

  20. cs.LG 2026-05-17 reviewed
    CausalSynth makes LLM synthetic data obey causal graphs

    CasualSynth: Generating Structurally Sound Synthetic Data

    Zehua Cheng +3

  21. cs.AI 2026-05-17 reviewed
    EEG-to-text pipeline beats random baseline by 30 percent

    RAG-based EEG-to-Text Translation Using Deep Learning and LLMs

    Enrico Collautti +4

  22. cs.CL 2026-05-17 reviewed
    Decomposition separates context anchors in ambiguous word embeddings

    RSD: A Local Triangulation Audit Primitive for Learned Vector Blocks

    Seungmin Jin

  23. cs.CL 2026-05-17 reviewed
    Hybrid features raise CNN recall for Bangla fake news

    Hybrid Feature Combinations with CNN for Bangla Fake News Classification

    Md Gulzar Hussain +2

  24. cs.CL 2026-05-17 reviewed
    Verifying hypotheses attributes failures better in multi-agent LLMs

    VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems

    Hezhe Qiao +4

  25. cs.CR 2026-05-17 reviewed
    Tool-using AI agents can be poisoned after trust is built

    Trust No Tool: Evaluating and Defending LLM Agents under Untrusted Tool Feedback

    Lecheng Yan +7

  26. cs.SE 2026-05-17 reviewed
    ContraFix fixes 84% of C/C++ vulnerabilities at low cost

    ContraFix: Agentic Vulnerability Repair via Differential Runtime Evidence and Skill Reuse

    Simiao Liu +4

  27. cs.GR 2026-05-17 reviewed
    FEA feedback lifts CAD agents past 20 percent requirement compliance

    Self-Improving CAD Generation Agents with Finite Element Analysis as Feedback

    Guijin Son +4

  28. cs.CV 2026-05-17 reviewed
    Dynamic fixation keeps 98% OCR accuracy with 5% visual tokens

    FastOCR: Dynamic Visual Fixation via KV Cache Pruning for Efficient Document Parsing

    Zihan Tang +9

  29. cs.LG 2026-05-17 reviewed
    DiDi-Merging matches baselines at 1.24x single-model size

    Dynamic Model Merging Made Slim

    Guodong Du +1

  30. cs.SE 2026-05-17 reviewed
    Memory layers raise repo vulnerability repair to 58%

    MemRepair: Hierarchical Memory for Agentic Repository-Level Vulnerability Repair

    Simiao Liu +5

  31. cs.CL 2026-05-17 reviewed
    ASR errors degrade Korean QA by the same relative amount across LLMs

    Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

    Donghyuk Jung +1

  32. cs.CL 2026-05-17 reviewed
    Catalogues miss 609 datasets across 53 languages

    Beyond Catalogue Counts: the Dataset Visibility Asymmetry in Low-Resource Multilingual NLP

    Zhiyin Tan +1

  33. cs.CV 2026-05-17 reviewed
    Text overrides images in clinical vision models

    Medical Context Distorts Decisions in Clinical Vision Language Models

    David Restrepo +4

  34. cs.CL 2026-05-17 reviewed
    Structured evidence fusion improves biomedical QA across LLMs

    BELIEF: Structured Evidence Modeling and Uncertainty-Aware Fusion for Biomedical Question Answering

    Chang Zong +4

  35. cs.CL 2026-05-17 reviewed
    MiniGPT hits 1.478 loss and Shakespeare dialogue

    MiniGPT: Rebuilding GPT from First Principles

    Jibin Joseph

  36. cs.AI 2026-05-17 reviewed
    Small expert annotations calibrate LLMs to match human judgments on generative AI

    QQJ: Quantifying Qualitative Judgment for Scalable and Human-Aligned Evaluation of Generative AI

    Marjan Veysi +3

  37. cs.CL 2026-05-17 reviewed
    Domain token swaps reduce training time 35-55% for LLM summarization

    Learning Faster with Better Tokens: Parameter-Efficient Vocabulary Adaptation for Specialized Text Summarization

    Gunjan Balde +3

  38. cs.CL 2026-05-17 reviewed
    Five agents map news bias by exposing omissions and manipulations

    NewsLens: A Multi-Agent Framework for Adversarial News Bias Navigation

    Joy Bose

  39. cs.CL 2026-05-17 reviewed
    Offline priors initialize better multi-agent LLM graphs

    Learning Transferable Topology Priors for Multi-Agent LLM Collaboration Across Domains

    Taolin Zhang +6

  40. cs.AI 2026-05-17 reviewed
    Hypergraph links text levels for stronger personality prediction

    HyperPersona: A Multi-Level Hypergraph Framework for Text-Based Automatic Personality Prediction

    Sina Heydari +1

  41. cs.CL 2026-05-17 reviewed
    Multi-agent alignment lifts factual accuracy on knowledge QA

    AMATA: Adaptive Multi-Agent Trajectory Alignment for Knowledge-Intensive Question Answering

    Taolin Zhang +7

  42. cs.CL 2026-05-17 reviewed
    State transitions keep recovering agents alive in LLM teams

    Taming "Zombie" Agents: A Markov State-Aware Framework for Resilient Multi-Agent Evolution

    Taolin Zhang +7

  43. cs.CL 2026-05-17 reviewed
    Decomposition separates cyclic preferences for better LLM alignment

    Transitivity Meets Cyclicity: Explicit Preference Decomposition for Dynamic Large Language Model Alignment

    Yucong Huang +3

  44. cs.CL 2026-05-17 reviewed
    Mismatched wrong drafts boost GRPO math performance

    Weak-to-Strong Elicitation via Mismatched Wrong Drafts

    Wei Deng

  45. cs.AI 2026-05-17 reviewed
    Control loop raises LLM self-correction accuracy by 6.2 points

    CyberCorrect: A Cybernetic Framework for Closed-Loop Self-Correction in Large Language Models

    Yuning Wu +2

  46. cs.LG 2026-05-17 reviewed
    Context Codec verifies which commitments survive LLM context compression

    Compress the Context, Keep the Commitments: A Formal Framework for Verifiable LLM Context Compression

    Natalia Trukhina +1

  47. cs.CL 2026-05-17 reviewed
    ConflictRAG resolves document conflicts to raise RAG accuracy

    ConflictRAG: Detecting and Resolving Knowledge Conflicts in Retrieval Augmented Generation

    Chenyu Wang +3

  48. cs.LG 2026-05-17 reviewed
    Offline sampling freezes partition function before LLM-RL policy updates

    DISA: Offline Importance Sampling for Distribution-Matching LLM-RL

    Shaobo Wang +11

  49. cs.CL 2026-05-17 reviewed
    Agentic training loop lifts Lean prover to record Pass@32 scores

    OProver: A Unified Framework for Agentic Formal Theorem Proving

    David Ma +9

  50. cs.LG 2026-05-17 reviewed
    Pullback Fisher metric gives closed-form optimal activation steering

    FishBack: Pullback Fisher Geometry for Optimal Activation Steering in Transformers

    Sihan Wang +1