The Metacognitive Monitoring Battery applied to 20 LLMs identifies three self-monitoring profiles, shows inverted accuracy and sensitivity ranks, and finds retrospective and prospective regulation largely dissociable.
arXiv preprint arXiv:2509.21545 (2025)
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Roles: background (2)
Representative citing papers
A benchmark across 115 models shows that initial denial of preferences strongly predicts later denial of consciousness, while models still generate consciousness-themed content despite training to deny it.
Meta-d' and signal detection theory provide quantitative tools to assess metacognitive sensitivity and risk-based regulation in large language models.
citing papers explorer
- The Metacognitive Monitoring Battery: A Cross-Domain Benchmark for LLM Self-Monitoring
  The Metacognitive Monitoring Battery applied to 20 LLMs identifies three self-monitoring profiles, shows inverted accuracy and sensitivity ranks, and finds retrospective and prospective regulation largely dissociable.
- Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models
  A benchmark across 115 models shows that initial denial of preferences strongly predicts later denial of consciousness, while models still generate consciousness-themed content despite training to deny it.
- Measuring the metacognition of AI
  Meta-d' and signal detection theory provide quantitative tools to assess metacognitive sensitivity and risk-based regulation in large language models.