“a is b” fail to learn “b is a”
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: support (1)
representative citing papers
DiffScore is a bidirectional masked-diffusion evaluation framework that measures text recoverability across masking rates and outperforms autoregressive baselines on ten benchmarks.
citing papers explorer
- What are the Right Symmetries for Formal Theorem Proving?
  Introduces rewriting categories to formalize proof equivariance and success invariance, shows LLM provers violate both, and demonstrates test-time aggregation recovers invariance and boosts performance.
- DiffScore: Text Evaluation Beyond Autoregressive Likelihood
  DiffScore is a bidirectional masked-diffusion evaluation framework that measures text recoverability across masking rates and outperforms autoregressive baselines on ten benchmarks.