Outcome evidence improves LLM accuracy on scientific feasibility assessment more consistently than experiment descriptions, which introduce brittleness under partial context.
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CL (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- Experiments or Outcomes? Probing Scientific Feasibility in Large Language Models