arXiv preprint arXiv:2506.22005

LeanConjecturer Authors · 2025 · arXiv 2506.22005

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DeFAb: A Verifiable Benchmark for Defeasible Abduction in Foundation Models

cs.AI · 2026-06-17 · conditional · novelty 8.0

DeFAb is a large-scale, formally verifiable benchmark for defeasible abduction derived from 18 knowledge bases, demonstrating that frontier LLMs achieve 7.8-65% accuracy versus 100% for a rule-based solver with polynomial-time checks.

Mapping Mathematical Hardness: Machine-Assisted Conjecture Discovery and the Quantification of Non-Triviality

math.GM · 2026-06-11 · unverdicted · novelty 5.0

Introduces a Mahalanobis-distance benchmark in conjecture embedding space to quantify non-triviality of AI-generated mathematical conjectures and flag potential errors.

citing papers explorer

DeFAb: A Verifiable Benchmark for Defeasible Abduction in Foundation Models

cs.AI · 2026-06-17 · conditional · none · ref 78

Mapping Mathematical Hardness: Machine-Assisted Conjecture Discovery and the Quantification of Non-Triviality

math.GM · 2026-06-11 · unverdicted · none · ref 21
