arxiv: 1906.11068 · v1 · pith:OVUTLT7Anew · submitted 2019-06-26 · 💻 cs.AI

Turing Test Revisited: A Framework for an Alternative

Pith reviewed 2026-05-25 15:32 UTC · model grok-4.3

classification 💻 cs.AI
keywords Turing Test · machine intelligence · intelligence testing · framework · subjective perception · AI evaluation · philosophy of mind

The pith

The Turing Test has a significant flaw, and a generic framework built on the subjective perception of intelligence can address it when testing machines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper questions the suitability of the Turing Test for machine intelligence in light of advances in science, medicine, and philosophy of mind. It conducts a detailed analysis of what passing the test requires and identifies a flaw in the original approach. A systematic method is then used to determine the elements needed for valid tests of intelligent machines. From this, the paper constructs a plausible generic framework drawing on categories of factors that subjective perception of intelligence implies. The work closes with an evaluative discussion of issues still unaddressed in the proposed framework.

Core claim

While the core idea of the Turing Test may appear sound, detailed analysis of its requirements reveals a significant flaw; a systematic approach yields a plausible generic framework for testing intelligent machines that rests on categories of factors implied by subjective perception of intelligence.

What carries the argument

A generic framework based on categories of factors implied by subjective perception of intelligence, which organizes the requirements for machine intelligence tests.

If this is right

  • Tests for intelligent machines can be devised by following the systematic approach outlined.
  • The framework provides an alternative that avoids the identified flaw in the Turing Test.
  • An evaluative discussion can highlight remaining issues that the framework leaves open.
  • Advances in philosophy of mind and related fields can inform the categories used in the framework.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Applying the framework to existing AI systems could reveal whether current benchmarks align with human subjective judgments of intelligence.
  • The categories might extend to hybrid human-machine systems or collective intelligence scenarios not addressed in the original Turing Test.
  • Further work could test whether the framework produces consistent results across different cultural perceptions of intelligence.

Load-bearing premise

Subjective perception of intelligence supplies the correct and sufficient categories for constructing valid tests of machine intelligence.

What would settle it

A demonstration that no set of tests constructed from the proposed categories reliably separates machines judged intelligent by human observers from those judged non-intelligent would falsify the framework's validity.
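
The falsification criterion above can be sketched in code. This is a minimal illustration, not anything the paper specifies: the function names `best_separation` and `framework_survives`, the use of balanced accuracy, the single-threshold decision rule, and the 0.8 cut-off are all assumptions chosen for the example.

```python
from typing import Mapping, Sequence


def best_separation(scores: Sequence[float], judged_intelligent: Sequence[bool]) -> float:
    """Best balanced accuracy that any single score threshold achieves.

    Hypothetical helper: a machine is predicted 'intelligent' when its
    test score is at or above the threshold.
    """
    if len(scores) != len(judged_intelligent):
        raise ValueError("scores and labels must have the same length")
    pos = [s for s, y in zip(scores, judged_intelligent) if y]
    neg = [s for s, y in zip(scores, judged_intelligent) if not y]
    if not pos or not neg:
        raise ValueError("need machines from both judged groups")
    best = 0.5  # chance level for balanced accuracy
    for t in sorted(set(scores)):
        true_pos_rate = sum(s >= t for s in pos) / len(pos)
        true_neg_rate = sum(s < t for s in neg) / len(neg)
        best = max(best, (true_pos_rate + true_neg_rate) / 2)
    return best


def framework_survives(
    tests: Mapping[str, Sequence[float]],
    judged_intelligent: Sequence[bool],
    min_balanced_accuracy: float = 0.8,  # assumed cut-off for "reliably separates"
) -> bool:
    """True if at least one test separates the two groups well enough."""
    return any(
        best_separation(scores, judged_intelligent) >= min_balanced_accuracy
        for scores in tests.values()
    )


if __name__ == "__main__":
    labels = [True, True, False, False]           # human observers' judgments
    tests = {
        "uninformative": [0.5, 0.5, 0.5, 0.5],    # cannot separate the groups
        "informative": [0.9, 0.8, 0.2, 0.1],      # separates them perfectly
    }
    print(framework_survives(tests, labels))
```

Under these assumptions, the framework would count as falsified only if `framework_survives` returned False for every test that could be built from the proposed categories, which a finite sample of tests can suggest but not prove.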

Figures

Figures reproduced from arXiv: 1906.11068 by Aladdin Ayesh.

Figure 1: Human Intelligence Engine of Irrationality
Figure 2: Attempt to Capture the Factors of Intelligence
Figure 3: The 4 E’s Framework

Original abstract

This paper aims to question the suitability of the Turing Test, for testing machine intelligence, in the light of advances made in the last 60 years in science, medicine, and philosophy of mind. While the main concept of the test may seem sound and valid, a detailed analysis of what is required to pass the test highlights a significant flow. Once the analysis of the test is presented, a systematic approach is followed in analysing what is needed to devise a test or tests for intelligent machines. The paper presents a plausible generic framework based on categories of factors implied by subjective perception of intelligence. An evaluative discussion concludes the paper highlighting some of the unaddressed issues within this generic framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper questions the suitability of the Turing Test for evaluating machine intelligence in light of advances in science, medicine, and philosophy of mind over the past 60 years. It asserts that a detailed analysis of the requirements to pass the test reveals a significant flaw. The manuscript then describes a systematic approach to devising tests for intelligent machines and presents a plausible generic framework constructed from categories of factors implied by subjective perception of intelligence, concluding with an evaluative discussion of unaddressed issues in the proposed framework.

Significance. If the analysis of the Turing Test flaw holds and the generic framework provides a coherent alternative, the work could modestly contribute to discussions in AI evaluation by incorporating insights from philosophy of mind and related fields. The paper's explicit acknowledgment of limitations positions it as exploratory rather than conclusive, potentially serving as a prompt for more rigorous follow-up studies on alternative testing paradigms.

major comments (2)
  1. [Abstract] Abstract: the central claim that a 'detailed analysis ... highlights a significant flaw' is presented without any derivation steps, specific requirements of the test, or error analysis in the visible text, leaving the assertion unsupported by verifiable evidence and undermining evaluation of whether the flaw is load-bearing for rejecting the Turing Test.
  2. [Framework section] The section presenting the generic framework: the construction relies on 'categories of factors implied by subjective perception of intelligence' as the grounding axiom, but this introduces a circularity risk because the categories are derived from the subjective judgment the framework is intended to test, with no independent external benchmark or falsifiable criterion stated.
minor comments (2)
  1. [Abstract] Abstract: 'significant flow' appears to be a typographical error for 'significant flaw'.
  2. The manuscript would benefit from explicit section headings or numbered subsections to allow precise citation of the analysis of the Turing Test requirements and the framework construction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below, agreeing that clarifications and expansions are warranted to strengthen the presentation of the analysis and framework.

  1. Referee: [Abstract] Abstract: the central claim that a 'detailed analysis ... highlights a significant flaw' is presented without any derivation steps, specific requirements of the test, or error analysis in the visible text, leaving the assertion unsupported by verifiable evidence and undermining evaluation of whether the flaw is load-bearing for rejecting the Turing Test.

    Authors: The abstract functions as a high-level summary; the full derivation of the Turing Test requirements, the specific criteria examined, and the resulting flaw identification appear in the dedicated analysis section of the manuscript. We agree the abstract could better indicate the presence of this supporting analysis and will revise it to include a concise reference to the key steps and the nature of the flaw.
    Revision: yes

  2. Referee: [Framework section] The section presenting the generic framework: the construction relies on 'categories of factors implied by subjective perception of intelligence' as the grounding axiom, but this introduces a circularity risk because the categories are derived from the subjective judgment the framework is intended to test, with no independent external benchmark or falsifiable criterion stated.

    Authors: The framework is deliberately anchored in subjective perception to incorporate insights from philosophy of mind, as stated in the paper. The concluding evaluative discussion already flags several unaddressed issues in the framework. We accept that an explicit treatment of circularity risk, potential external benchmarks, and falsifiability would improve rigor and will add a targeted paragraph addressing these points while preserving the exploratory character of the work.
    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; framework is explicit construction from stated premise

The paper performs an analysis of the Turing Test, identifies a flaw, then explicitly constructs a generic framework from categories implied by subjective perception of intelligence. This is presented as a plausible approach rather than a first-principles derivation or prediction that reduces to its inputs by construction. No equations, fitted parameters, or self-citations are invoked in a load-bearing manner that would force equivalence between claim and premise. The manuscript flags unaddressed issues, confirming the construction is not asserted as uniquely determined or tautological. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The proposal rests on untested philosophical premises about intelligence and perception with no external benchmarks, data, or formal derivations supplied in the abstract.

axioms (2)
  • domain assumption: Advances in science, medicine, and philosophy of mind over the last 60 years render the Turing Test unsuitable
    Stated as the starting point for questioning the test's suitability.
  • ad hoc to paper: Subjective perception of intelligence implies usable categories of factors for constructing tests
    Directly invoked as the basis for the generic framework.
invented entities (1)
  • Generic framework for alternative intelligence tests (no independent evidence)
    purpose: To replace or supplement the Turing Test using subjective-perception categories
    Introduced in the abstract as the main constructive contribution; no independent evidence or falsifiable prediction is provided.

pith-pipeline@v0.9.0 · 5627 in / 1131 out tokens · 27982 ms · 2026-05-25T15:32:26.996827+00:00 · methodology

