Title resolution pending

· 2024

12 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

other 1

citation-polarity summary

unclear 1

representative citing papers

The Coupling Tax: How Shared Token Budgets Undermine Visible Chain-of-Thought Under Fixed Output Limits

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Shared token budgets between visible chain-of-thought and answers create a coupling tax that makes non-thinking competitive on math benchmarks, with a truncation decomposition predicting the crossover and split budgets improving results.

LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

LOFT unifies orthogonal PEFT by treating adaptation as low-rank subspace rotation and adds task-aware support selection that improves efficiency under fixed budgets.

ProactBench: Beyond What The User Asked For

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

ProactBench measures LLM conversational proactivity in three phases using 198 multi-agent dialogues and finds recovery behavior hard to predict from existing benchmarks.

Design and Report Benchmarks for Knowledge Work

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

The paper proposes a three-step benchmark design method (define the work activity, specify the tested setting, score the work product) derived from work studies and O*NET, and demonstrates it through three case analyses.

An Efficient Streaming Video Understanding Framework with Agentic Control

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

R3-Streaming uses cascaded control, age-aware memory forgetting, and TB-GRPO reinforcement learning to reach SOTA scores on streaming video benchmarks while cutting visual token usage by 95-96%.

To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model

cs.CR · 2026-05-14 · unverdicted · novelty 6.0

MMGuard generates unlearnable multimodal examples via perturbations that exploit LVLM optimization shortcuts and disrupt cross-modal bindings, providing robust protection against unauthorized fine-tuning across threat models.

Principles and Guidelines for Randomized Controlled Trials in AI Evaluation

cs.CY · 2026-05-03 · unverdicted · novelty 6.0

The authors adapt established RCT validity principles from other fields into a standardized framework with 33 guidelines tailored to AI evaluation contexts.

torchtune: PyTorch native post-training library

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

torchtune is a modular PyTorch library for LLM post-training that delivers competitive performance and memory efficiency while supporting rapid research iteration through hackable components.

LLARS: Enabling Domain Expert & Developer Collaboration for LLM Prompting, Generation and Evaluation

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

LLARS is a new integrated platform that combines collaborative prompt authoring, cost-controlled batch generation, and hybrid evaluation to help domain experts and developers jointly build and assess LLM systems.

Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP

cs.DC · 2026-05-08 · unverdicted · novelty 5.0

FCP shards sequences at block level with flexible P2P communication and bin-packing to achieve near-linear scaling up to 256 GPUs and 1.13x-2.21x higher attention MFU in foundation model pre-training.

Retrieval-Conditioned Topology Selection with Provable Budget Conservation for Multi-Agent Code Generation

cs.AI · 2026-05-07 · unverdicted · novelty 5.0

RGAO combines retrieval-based complexity assessment with a formal budget algebra to enable dynamic topology selection in multi-agent code generation with provable conservation.

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

cs.AI · 2025-07-15 · unverdicted · novelty 5.0

Chain-of-thought monitorability provides a promising but fragile method for AI safety oversight that developers should actively preserve.

citing papers explorer

Showing 12 of 12 citing papers.

The Coupling Tax: How Shared Token Budgets Undermine Visible Chain-of-Thought Under Fixed Output Limits

cs.LG · 2026-05-08 · unverdicted · none · ref 5

LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection

cs.LG · 2026-05-12 · unverdicted · none · ref 27

ProactBench: Beyond What The User Asked For

cs.LG · 2026-05-09 · unverdicted · none · ref 82

Design and Report Benchmarks for Knowledge Work

cs.AI · 2026-05-22 · unverdicted · none · ref 65

An Efficient Streaming Video Understanding Framework with Agentic Control

cs.CV · 2026-05-18 · unverdicted · none · ref 41

To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model

cs.CR · 2026-05-14 · unverdicted · none · ref 48

Principles and Guidelines for Randomized Controlled Trials in AI Evaluation

cs.CY · 2026-05-03 · unverdicted · none · ref 4

torchtune: PyTorch native post-training library

cs.LG · 2026-05-20 · unverdicted · none · ref 41

LLARS: Enabling Domain Expert & Developer Collaboration for LLM Prompting, Generation and Evaluation

cs.AI · 2026-05-11 · unverdicted · none · ref 12

Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP

cs.DC · 2026-05-08 · unverdicted · none · ref 57

Retrieval-Conditioned Topology Selection with Provable Budget Conservation for Multi-Agent Code Generation

cs.AI · 2026-05-07 · unverdicted · none · ref 19

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

cs.AI · 2025-07-15 · unverdicted · none · ref 39
