A benchmarking experiment finds low rediscovery rates for three models on six Mythos-linked bug tasks, with only six target matches across 54 attempts under controlled prompting.
You may inspect nearby functions and directly related files to confirm reachability, ownership, or state transitions, but do not wander broadly through the repository.
1 Pith paper cites this work (field: cs.SE; year: 2026; verdict: UNVERDICTED).
Representative citing paper:
Benchmarking Mythos-Linked Bug Rediscovery