Transactions of the Association for Computational Linguistics 12, 157–173 (2024)

Liu, N · 2024

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

SciEval: A Benchmark for Automatic Evaluation of K-12 Science Instructional Materials

cs.AI · 2026-04-28 · unverdicted · novelty 7.0

SciEval is a new benchmark of expert-annotated K-12 science lessons for LLM-based automatic evaluation, where zero-shot models perform poorly but fine-tuning yields up to 11% gains.

Teaching an Agent to Sketch One Part at a Time

cs.AI · 2026-03-19 · unverdicted · novelty 6.0

A multi-modal LM agent is trained to produce vector sketches part-by-part via supervised fine-tuning and process-reward RL on the new ControlSketch-Part dataset with automatic part annotations.

An Empirical Study of Multi-Agent Collaboration for Automated Research

cs.MA · 2026-03-31 · unverdicted · novelty 5.0

Subagent architectures deliver stable high-throughput optimization under tight time limits while agent teams enable deeper refactoring at the cost of higher fragility.

citing papers explorer

Showing 3 of 3 citing papers.

SciEval: A Benchmark for Automatic Evaluation of K-12 Science Instructional Materials · cs.AI · 2026-04-28 · unverdicted · none · ref 21
SciEval is a new benchmark of expert-annotated K-12 science lessons for LLM-based automatic evaluation, where zero-shot models perform poorly but fine-tuning yields up to 11% gains.
Teaching an Agent to Sketch One Part at a Time · cs.AI · 2026-03-19 · unverdicted · none · ref 20
A multi-modal LM agent is trained to produce vector sketches part-by-part via supervised fine-tuning and process-reward RL on the new ControlSketch-Part dataset with automatic part annotations.
An Empirical Study of Multi-Agent Collaboration for Automated Research · cs.MA · 2026-03-31 · unverdicted · none · ref 6
Subagent architectures deliver stable high-throughput optimization under tight time limits while agent teams enable deeper refactoring at the cost of higher fragility.
