archive

Every paper Pith has read. Search by title, abstract, or pith.

7661 papers in cs.CL · page 15

cs.CL 2026-05-14 reviewed

Geometry scores pick shallow layers for diffusion insertion in transformers
Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

Injin Kong +2
cs.CL 2026-05-14 reviewed

Semantic RL adds low-resource languages without erasing prior skills
Reinforcement Learning with Semantic Rewards Enables Low-Resource Language Expansion without Alignment Tax

Zeli Su +9
cs.HC 2026-05-14 reviewed

Short concern texts track with activity drops and sleep issues
A Formative Study of Brief Affective Text as a Complement to Wearable Sensing for Longitudinal Student Health Monitoring

Tamunotonye Harry +9
cs.CL 2026-05-14 reviewed

LLM filter and clustering finds 41 manipulative narrative clusters
LLM-based Detection of Manipulative Political Narratives

Sinclair Schneider +2
cs.CL 2026-05-14 reviewed

Prompt filter and clustering finds 41 narrative clusters
LLM-based Detection of Manipulative Political Narratives

Sinclair Schneider +2
cs.CL 2026-05-14 reviewed

Transformers score German texts on left-right scale
Ideology Prediction of German Political Texts

Sinclair Schneider +3
cs.LG 2026-05-14 reviewed

Dynamic Latent Routing beats supervised fine-tuning by 6.6 points
Dynamic Latent Routing

Fangyuan Yu +2
cs.CL 2026-05-14 reviewed

Exact prefix factorization removes errors in diffusion language models
Factorization-Error-Free Discrete Diffusion Language Model via Speculative Decoding

Xun Fang +3
cs.LG 2026-05-14 reviewed

Minimal KV scorer tweak beats complex cache redesigns
Minimal-Intervention KV Retention via Set-Conditioned Diversity

Libo Sun +3
cs.LG 2026-05-14 reviewed

Simple diversity penalty in KV scorer beats complex designs
Minimal-Intervention KV Retention via Set-Conditioned Diversity

Libo Sun +3
cs.CR 2026-05-14 reviewed

Hidden noise stops vision-language models learning real content
To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model

Chengshuai Zhao +4
cs.CR 2026-05-14 reviewed

Web agents should plan before seeing page content
Web Agents Should Adopt the Plan-Then-Execute Paradigm

Julien Piet +7
cs.LG 2026-05-14 reviewed

MetaMoE combines independently trained expert models into one Mixture-of-Experts system…
MetaMoE: Diversity-Aware Proxy Selection for Privacy-Preserving Mixture-of-Experts Unification

Weisen Jiang +2
cs.CL 2026-05-14 reviewed

Agent harnesses break rules mid-task despite safe final answers
Auditing Agent Harness Safety

Chengzhi Liu +10
cs.CL 2026-05-14 reviewed

Agent harnesses allow unsafe actions even with correct final outputs
Auditing Agent Harness Safety

Chengzhi Liu +10
cs.AI 2026-05-14 reviewed

Hypergraph reasoner hits 94.7% on supply chain RCA
Hypergraph Enterprise Agentic Reasoner over Heterogeneous Business Systems

Ling Wang +10
cs.CL 2026-05-14 reviewed

Spelling and test design confound KVL word difficulty ratings
Sakura at BEA 2026 Shared Task 1: What Makes Vocabulary Difficult?

Adam Nohejl +5
cs.CL 2026-05-14 reviewed

LLM rates vocab difficulty at r > 0.91
Sakura at BEA 2026 Shared Task 1: What Makes Vocabulary Difficult?

Adam Nohejl +5
cs.LG 2026-05-14 reviewed

Active learners raise NDCG@10 per call in PRP reranking
Active Learners as Efficient PRP Rerankers

Jeremías Figueiredo Paschmann +5
cs.LG 2026-05-14 reviewed

Active rankers lift NDCG@10 per call in PRP reranking
Active Learners as Efficient PRP Rerankers

Jeremías Figueiredo Paschmann +5
cs.LG 2026-05-14 reviewed

Transformer predicts next disease with 0.871 median AUC across 896 categories
DT-Transformer: A Foundation Model for Disease Trajectory Prediction on a Real-world Health System

Yunying Zhu +3
cs.LG 2026-05-14 reviewed

Small mismatches in LLM RL rollout and optimization cause collapse
Diagnosing Training Inference Mismatch in LLM Reinforcement Learning

Tianle Zhong +7
cs.LG 2026-05-14 reviewed

Prefill-only adapters deliver 1.9x throughput for 512 users
PreFT: Prefill-only finetuning for efficient inference

Andrew Lanpouthakoun +6
cs.CL 2026-05-13 reviewed

Filter drops harmful examples to hold LLM attack rate below 6 percent
GradShield: Alignment Preserving Finetuning

Zhanhao Hu +6
cs.CL 2026-05-13 reviewed

RAG succeeds when evidence flows deeper and is more distributed
Why Retrieval-Augmented Generation Fails: A Graph Perspective

Kai Guo +7
quant-ph 2026-05-13 reviewed

Engineered texts recover exact backbones on 100-atom quantum processor
QOuLiPo: What a quantum computer sees when it reads a book

Christophe Jurczak
cs.IR 2026-05-13 reviewed

Imagined future steps triple recall of distant memories
Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models

Harshita Chopra +4
cs.CL 2026-05-13 reviewed

Search-based bookmarks beat summarization for role-play memory
BOOKMARKS: Efficient Active Storyline Memory for Role-playing

Letian Peng +6
cs.CL 2026-05-13 reviewed

Safety refusals rise with Korean language but drop with Korean context
ROK-FORTRESS: Measuring the Effect of Geopolitical Transcreation for National Security and Public Safety

Michael S. Lee +15
cs.CL 2026-05-13 reviewed

Distance and direction encode relations in LLM embeddings
Polar probe linearly decodes semantic structures from LLMs

Pablo J. Diego-Simón +4
cs.CL 2026-05-13 reviewed

LLMs encode relations as distances and directions in embeddings
Polar probe linearly decodes semantic structures from LLMs

Pablo J. Diego-Simón +4
cs.LG 2026-05-13 reviewed

Routed small models add value to AlphaEarth on hydrology questions
Mini-JEPA Foundation Model Fleet Enables Agentic Hydrologic Intelligence

Mashrekur Rahman
cs.CL 2026-05-13 reviewed

LLM with verifiable RL rewards meets room sizes and connections
Generative Floor Plan Design with LLMs via Reinforcement Learning with Verifiable Rewards

Luis Lara +7
cs.CL 2026-05-13 reviewed

Reversing conflicting document order flips 11-25% of LLM answers
When Evidence Conflicts: Uncertainty and Order Effects in Retrieval-Augmented Biomedical Question Answering

Yikun Han +2
stat.ML 2026-05-13 reviewed

Conformal method bounds confident errors in CoT reasoning
Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning

Yu Gu +3
cs.CL 2026-05-13 reviewed

DExperts blocks explicit toxicity but slips on implicit hate speech
Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study

Mokshit Surana +2
cs.CL 2026-05-13 reviewed

DExperts hits 100% safety on explicit toxicity but drops on implicit
Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study

Mokshit Surana +2
cs.SE 2026-05-13 reviewed

Constrained edits merge checkpoints to lift code agent scores
CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing

Mingzhi Zhu +3
cs.LG 2026-05-13 reviewed

Cosine similarity misleads on which layers matter in LLMs
Rethinking Layer Relevance in Large Language Models Beyond Cosine Similarity

Cristian Hinostroza +6
cs.CL 2026-05-13 reviewed

Adaptive weights fix distribution drift in LLM reasoning distillation
Distribution Corrected Offline Data Distillation for Large Language Models

Yumeng Zhang +3
eess.AS 2026-05-13 reviewed

Benchmark standardizes early Parkinson's speech detection
A Benchmark for Early-stage Parkinson's Disease Detection from Speech

Terry Yi Zhong +5
cs.AI 2026-05-13 reviewed

Early rejection cuts LLM synthetic data tokens by 11-77%
Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection

Anjir Ahmed Chowdhury +2
cs.CL 2026-05-13 reviewed

Dual RL agents learn to probe like Supreme Court justices
Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents

Xubo Lin +4
cs.CL 2026-05-13 reviewed

New method lifts multi-task LLM accuracy by 6.67 percent
PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts

Anjir Ahmed Chowdhury +4
cs.CL 2026-05-13 reviewed

Logic rules in prompts cut RAG errors via derivation trees
Derivation Prompting: A Logic-Based Method for Improving Retrieval-Augmented Generation

Ignacio Sastre +2
cs.AI 2026-05-13 reviewed

Formal checks can keep AI legal reasoning inside the text
Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning

Olivia Peiyu Wang +1
cs.CL 2026-05-13 reviewed

Audited data lifts 8B model 18 points on physics olympiads
Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning

Shan Yang
cs.LG 2026-05-13 reviewed

Learned predictor prunes KV cache 3-10x on the fly
Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility

Gergely Szilvasy +10
cs.AI 2026-05-13 reviewed

Unary recoding enables polynomial-time rule learning for LLMs
Enhanced and Efficient Reasoning in Large Learning Models

Leslie G. Valiant