Quantifiers and witnesses for the nonclassicality of measurements and of states
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 3
roles: background 2
citing papers explorer
- Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark
  CritPt benchmark shows state-of-the-art LLMs reach only 5.7% average accuracy on full-scale unpublished physics research tasks, rising to about 10% with coding tools.
- Commutativity from a single Bargmann invariant equality
  Two quantum states ρ₁ and ρ₂ commute exactly when tr(ρ₁²ρ₂²) = tr(ρ₁ ρ₂ ρ₁ ρ₂).
- Classical Limit: Dissipation of Spekkens' Generalised Contextuality under Decoherence
  A Spekkens contextual system of odd-dimensional stabilizer plus magic state becomes noncontextual under depolarizing decoherence past a threshold, with quasiprobability representations differing in how well they detect the shift.
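
The commutativity criterion quoted in the list above, tr(ρ₁²ρ₂²) = tr(ρ₁ ρ₂ ρ₁ ρ₂), can be checked numerically on small examples. The sketch below is illustrative only and is not taken from the cited paper: the helper names (`matmul`, `trace`, `bargmann_gap`) and the 2x2 example states are assumptions chosen for the demonstration. It relies on a standard identity for Hermitian matrices: the difference of the two traces equals half the squared Hilbert-Schmidt norm of the commutator [ρ₁, ρ₂], so it vanishes exactly when the states commute.

```python
# Illustrative sketch in pure Python; matrices are nested lists.
# All function names and example states here are hypothetical.

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def bargmann_gap(rho1, rho2):
    """Return tr(rho1^2 rho2^2) - tr(rho1 rho2 rho1 rho2).

    For Hermitian inputs this is half the squared Hilbert-Schmidt norm
    of the commutator, hence non-negative and zero iff the inputs commute.
    """
    lhs = trace(matmul(matmul(rho1, rho1), matmul(rho2, rho2)))
    r12 = matmul(rho1, rho2)
    rhs = trace(matmul(r12, r12))
    return (lhs - rhs).real  # the imaginary part is zero for Hermitian inputs

# Two qubit states diagonal in the same basis: they commute.
rho_a = [[0.75, 0.0], [0.0, 0.25]]
rho_b = [[0.3, 0.0], [0.0, 0.7]]

# The pure state |+><+| does not commute with rho_a.
rho_plus = [[0.5, 0.5], [0.5, 0.5]]

print(bargmann_gap(rho_a, rho_b))     # zero up to floating-point rounding
print(bargmann_gap(rho_a, rho_plus))  # 0.0625, so the states do not commute
```

In floating-point arithmetic the comparison should use a tolerance rather than exact equality.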