Turing Test Revisited: A Framework for an Alternative
Pith reviewed 2026-05-25 15:32 UTC · model grok-4.3
The pith
The Turing Test has a significant flaw, and a generic framework built on the subjective perception of intelligence offers an alternative way to test machines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
While the core idea of the Turing Test may appear sound, detailed analysis of its requirements reveals a significant flaw; a systematic approach yields a plausible generic framework for testing intelligent machines that rests on categories of factors implied by subjective perception of intelligence.
What carries the argument
A generic framework based on categories of factors implied by subjective perception of intelligence, which organizes the requirements for machine intelligence tests.
If this is right
- Tests for intelligent machines can be devised by following the systematic approach outlined.
- The framework provides an alternative that avoids the identified flaw in the Turing Test.
- An evaluative discussion can highlight remaining issues that the framework leaves open.
- Advances in philosophy of mind and related fields can inform the categories used in the framework.
Where Pith is reading between the lines
- Applying the framework to existing AI systems could reveal whether current benchmarks align with human subjective judgments of intelligence.
- The categories might extend to hybrid human-machine systems or collective intelligence scenarios not addressed in the original Turing Test.
- Further work could test whether the framework produces consistent results across different cultural perceptions of intelligence.
Load-bearing premise
Subjective perception of intelligence supplies the correct and sufficient categories for constructing valid tests of machine intelligence.
What would settle it
A demonstration that no set of tests constructed from the proposed categories reliably separates machines judged intelligent by human observers from those judged non-intelligent would falsify the framework's validity.
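One way to make "reliably separates" concrete is a rank-based separation score between the two groups of machines. The sketch below is an illustration only, not the paper's method: the function name `separation_auc`, the idea of a single numeric test score per machine, and the sample scores are all assumptions introduced here. It computes the probability that a machine judged intelligent by human observers outscores one judged non-intelligent, which is the area under the ROC curve.

```python
from itertools import product


def separation_auc(scores_intelligent, scores_not):
    """Probability that a randomly chosen machine judged intelligent
    outscores a randomly chosen machine judged non-intelligent.
    Ties count as half a win. 1.0 means perfect separation,
    0.5 means the test does no better than chance."""
    if not scores_intelligent or not scores_not:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for a, b in product(scores_intelligent, scores_not):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    return wins / (len(scores_intelligent) * len(scores_not))


# Hypothetical scores from a framework-derived test (made-up numbers).
judged_intelligent = [0.9, 0.8, 0.7]
judged_not = [0.6, 0.4, 0.75]

auc = separation_auc(judged_intelligent, judged_not)
print(f"separation AUC = {auc:.3f}")  # prints "separation AUC = 0.889"
```

Under this reading, the framework would be in trouble if every candidate test built from its categories scored close to 0.5 on such a measure. The threshold for "close" is a further choice that the source does not specify.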
Original abstract
This paper aims to question the suitability of the Turing Test, for testing machine intelligence, in the light of advances made in the last 60 years in science, medicine, and philosophy of mind. While the main concept of the test may seem sound and valid, a detailed analysis of what is required to pass the test highlights a significant flow. Once the analysis of the test is presented, a systematic approach is followed in analysing what is needed to devise a test or tests for intelligent machines. The paper presents a plausible generic framework based on categories of factors implied by subjective perception of intelligence. An evaluative discussion concludes the paper highlighting some of the unaddressed issues within this generic framework.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper questions the suitability of the Turing Test for evaluating machine intelligence in light of advances in science, medicine, and philosophy of mind over the past 60 years. It asserts that a detailed analysis of the requirements to pass the test reveals a significant flaw. The manuscript then describes a systematic approach to devising tests for intelligent machines and presents a plausible generic framework constructed from categories of factors implied by subjective perception of intelligence, concluding with an evaluative discussion of unaddressed issues in the proposed framework.
Significance. If the analysis of the Turing Test flaw holds and the generic framework provides a coherent alternative, the work could modestly contribute to discussions in AI evaluation by incorporating insights from philosophy of mind and related fields. The paper's explicit acknowledgment of limitations positions it as exploratory rather than conclusive, potentially serving as a prompt for more rigorous follow-up studies on alternative testing paradigms.
major comments (2)
- [Abstract] Abstract: the central claim that a 'detailed analysis ... highlights a significant flaw' is presented without any derivation steps, specific requirements of the test, or error analysis in the visible text, leaving the assertion unsupported by verifiable evidence and undermining evaluation of whether the flaw is load-bearing for rejecting the Turing Test.
- [Framework section] The section presenting the generic framework: the construction relies on 'categories of factors implied by subjective perception of intelligence' as the grounding axiom, but this introduces a circularity risk because the categories are derived from the subjective judgment the framework is intended to test, with no independent external benchmark or falsifiable criterion stated.
minor comments (2)
- [Abstract] Abstract: 'significant flow' appears to be a typographical error for 'significant flaw'.
- The manuscript would benefit from explicit section headings or numbered subsections to allow precise citation of the analysis of the Turing Test requirements and the framework construction.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below, agreeing that clarifications and expansions are warranted to strengthen the presentation of the analysis and framework.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that a 'detailed analysis ... highlights a significant flaw' is presented without any derivation steps, specific requirements of the test, or error analysis in the visible text, leaving the assertion unsupported by verifiable evidence and undermining evaluation of whether the flaw is load-bearing for rejecting the Turing Test.
  Authors: The abstract functions as a high-level summary; the full derivation of the Turing Test requirements, the specific criteria examined, and the resulting flaw identification appear in the dedicated analysis section of the manuscript. We agree the abstract could better indicate the presence of this supporting analysis and will revise it to include a concise reference to the key steps and the nature of the flaw.
  Revision: yes
- Referee: [Framework section] The section presenting the generic framework: the construction relies on 'categories of factors implied by subjective perception of intelligence' as the grounding axiom, but this introduces a circularity risk because the categories are derived from the subjective judgment the framework is intended to test, with no independent external benchmark or falsifiable criterion stated.
  Authors: The framework is deliberately anchored in subjective perception to incorporate insights from philosophy of mind, as stated in the paper. The concluding evaluative discussion already flags several unaddressed issues in the framework. We accept that an explicit treatment of circularity risk, potential external benchmarks, and falsifiability would improve rigor and will add a targeted paragraph addressing these points while preserving the exploratory character of the work.
  Revision: yes
Circularity Check
No significant circularity; framework is explicit construction from stated premise
Full rationale
The paper performs an analysis of the Turing Test, identifies a flaw, then explicitly constructs a generic framework from categories implied by subjective perception of intelligence. This is presented as a plausible approach rather than a first-principles derivation or prediction that reduces to its inputs by construction. No equations, fitted parameters, or self-citations are invoked in a load-bearing manner that would force equivalence between claim and premise. The manuscript flags unaddressed issues, confirming the construction is not asserted as uniquely determined or tautological. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Advances in science, medicine, and philosophy of mind over the last 60 years render the Turing Test unsuitable
- Ad hoc to paper: Subjective perception of intelligence implies usable categories of factors for constructing tests
invented entities (1)
- Generic framework for alternative intelligence tests (no independent evidence)