The Binding Constraint Thesis states that harness configuration governs performance variance more than model choice in long-horizon agent tasks, leading to misattribution in evaluations.
Agent psychometrics: Task-level performance prediction in agentic coding benchmarks.arXiv preprint arXiv:2604.00594, 2026
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Stop Comparing LLM Agents Without Disclosing the Harness
The Binding Constraint Thesis states that harness configuration governs performance variance more than model choice in long-horizon agent tasks, leading to misattribution in evaluations.