Title resolution pending

Jialin Li, Yuan Wu, Yi Chang · 2026 · arXiv 2603.00187

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

CRAB-Bench: Evaluating LLM Agents under Complex Task Dependencies and Human-aligned User Simulation

cs.CL · 2026-06-01 · unverdicted · novelty 7.0

CRAB-Bench and RUSE form a new evaluation framework for LLM agents on constraint-graph tasks with realistic, human-like user behaviors. The best model reaches 61% pass@1, with further drops of up to 57% under RUSE.

citing papers explorer

Showing 1 of 1 citing paper.

CRAB-Bench: Evaluating LLM Agents under Complex Task Dependencies and Human-aligned User Simulation

cs.CL · 2026-06-01 · unverdicted · none · ref 36

CRAB-Bench and RUSE form a new evaluation framework for LLM agents on constraint-graph tasks with realistic, human-like user behaviors. The best model reaches 61% pass@1, with further drops of up to 57% under RUSE.
