Krippendorff, Content Analysis: An Introduction to Its Methodology, 4th ed
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
ValueBlindBench: Agreement-Gated Stress Testing of LLM-Judged Investment Rationales Before Returns Are Observable
ValueBlindBench is a preregistered agreement-gated stress-test protocol for deciding when LLM-judged investment-rationale claims are stable enough to report, using 1,100 trajectories and 5,500 judge calls to gate claims by weighted kappa agreement.
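As a rough illustration of what gating a claim by weighted kappa agreement can look like, the sketch below computes Cohen's weighted kappa between two sets of ordinal judge scores and reports a claim only when agreement clears a threshold. The function names `weighted_kappa` and `agreement_gate`, the quadratic weighting, the integer rating scale, and the 0.6 threshold are all assumptions made for this example; the summary above does not state which weighting scheme or cutoff ValueBlindBench preregisters.

```python
from collections import Counter


def weighted_kappa(a, b, k, weights="quadratic"):
    """Cohen's weighted kappa for two raters with integer labels 0..k-1.

    `weights` is "quadratic" or "linear" (an assumption of this sketch).
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    if k < 2:
        raise ValueError("need at least two rating categories")
    n = len(a)

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d * d if weights == "quadratic" else d

    # Mean disagreement actually observed between the two raters.
    observed = sum(disagreement(x, y) for x, y in zip(a, b)) / n

    # Mean disagreement expected by chance from the marginal label counts.
    ca, cb = Counter(a), Counter(b)
    expected = sum(
        disagreement(i, j) * ca[i] * cb[j] for i in range(k) for j in range(k)
    ) / (n * n)

    if expected == 0:
        return 1.0  # both raters used the same single label throughout
    return 1.0 - observed / expected


def agreement_gate(a, b, k, threshold=0.6):
    """Return True when agreement is high enough to report the claim.

    The 0.6 default is a hypothetical cutoff, not taken from the paper.
    """
    return weighted_kappa(a, b, k) >= threshold


judge_1 = [0, 1, 2, 2]
judge_2 = [0, 1, 2, 1]
print(weighted_kappa(judge_1, judge_2, k=3))
print(agreement_gate(judge_1, judge_2, k=3))
```

Quadratic weights penalize large rating gaps more heavily than adjacent ones, which is a common choice for ordinal rubrics; a linear scheme is available through the `weights` argument.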