Classification Schemas for Artificial Intelligence Failures

Peter J. Scott; Roman V. Yampolskiy

arxiv: 1907.07771 · v1 · pith:J62SPG23new · submitted 2019-07-15 · 💻 cs.CY

Pith reviewed 2026-05-24 21:08 UTC · model grok-4.3

keywords: AI failures · classification scheme · artificial intelligence safety · risk assessment · failure categorization · development lifecycle · historical analysis

The pith

Classifying historical AI failures can simplify responses to future failures and support risk assessments in development.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper reviews past instances of artificial intelligence systems failing and develops a way to group them into categories. The authors propose that having such categories makes it easier to decide what to do when new failures occur. They also argue that using these categories to check for risks while developing AI could help avoid some problems before they happen. The goal is to move from reacting to each failure individually to having a structured approach based on patterns from history.

Core claim

The authors examine historical failures of artificial intelligence and propose a classification scheme for categorizing future failures. This scheme is intended to simplify the choice of response to future failures and to allow development lifecycles to be augmented with targeted risk assessments, ultimately reducing the number of future failures.

What carries the argument

A classification scheme for AI failures based on historical examples that organizes them to guide responses and risk assessments.

If this is right

Future AI failures can be responded to more efficiently by matching them to known categories.
AI development processes can incorporate specific risk assessments derived from the failure categories.
Overall incidence of AI failures may decrease due to preventive measures informed by the classification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The scheme might require periodic updates to cover failure modes from new AI capabilities absent in the historical record.
It could serve as a template for creating similar taxonomies in related domains such as autonomous systems or machine learning security.
Adoption in industry standards might lead to more uniform safety practices across different organizations.

Load-bearing premise

That a classification derived from historical AI failures will generalize usefully to future failures whose causes and contexts may differ substantially from the examined cases.

What would settle it

A series of new AI failures occurring that do not fit the proposed categories and for which the classification does not simplify the choice of response.
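One rough way to operationalize the first half of this test is to track how often newly reported failures receive no category at all. The sketch below is illustrative only: the function name `unclassified_rate`, the use of `None` to mean "fits no category", and the placeholder labels are assumptions, not anything defined in the paper. It does not measure whether classification simplifies the choice of response, which would need a separate study.

```python
from typing import Optional, Sequence


def unclassified_rate(assigned: Sequence[Optional[str]]) -> float:
    """Return the fraction of failures that received no category.

    Each entry is the category a reviewer assigned to one new failure,
    or None when the failure fit none of the proposed categories.
    (This encoding is a hypothetical convention for the sketch.)
    """
    if not assigned:
        raise ValueError("need at least one failure record")
    misses = sum(1 for category in assigned if category is None)
    return misses / len(assigned)


# Placeholder labels; two of the four new failures fit no category.
new_failures = ["category_a", None, "category_b", None]
rate = unclassified_rate(new_failures)
```

A rate that stays high as new incidents accumulate would count against the scheme's coverage; what threshold counts as "high" is left open here.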

Figures

Figures reproduced from arXiv: 1907.07771 by Peter J. Scott, Roman V. Yampolskiy.

**Figure 2.** Near- and Far-Term AI Failure Scenarios

Original abstract

In this paper we examine historical failures of artificial intelligence (AI) and propose a classification scheme for categorizing future failures. By doing so we hope that (a) the responses to future failures can be improved through applying a systematic classification that can be used to simplify the choice of response and (b) future failures can be reduced through augmenting development lifecycles with targeted risk assessments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper proposes a taxonomy of AI failures from historical examples to guide responses and risk assessments, but supplies no test of whether the categories transfer to new systems or actually simplify anything.

The core offering is a classification scheme built from past AI incidents, with the claim that systematic categories will make response choices easier and let developers target risk checks during building. It reviews some historical cases and organizes them into groups, which at least collects the examples in one place and gives names to patterns people already talk about informally. That part is straightforward and could serve as a reference list for anyone new to the area.

The paper does not claim to derive the schema from first principles or reduce existing work; it presents it as an examination plus proposal. No equations or fitted models appear, and the argument stays at the level of logical suggestion rather than measured outcome.

The main weakness is the missing link between the proposed categories and any demonstrated benefit. The authors assert that the schema will reduce future failures through better risk assessments, yet they give no hold-out check, no comparison against existing taxonomies, and no forward-looking examples to show coverage of post-2019 systems. The stress-test point about distribution shift lands: categories fitted to earlier narrow AI may miss emergent behaviors in large models or different deployment settings, and the paper offers no update rule or validation step.

This is the sort of organizing paper that can start a conversation in AI safety or reliability groups, but it will not change methods on its own. Readers already working on failure taxonomies might skim it for the collected examples; others can skip it. It is coherent on its own terms and engages the literature without internal contradictions, so it clears the bar for a serious referee even if the evidence bar is low. I would send it to review with a request for some form of validation or explicit discussion of generalization limits.

Referee Report

2 major / 2 minor

Summary. The paper examines historical failures of artificial intelligence (AI) and proposes a classification scheme for categorizing future failures. By doing so the authors hope that (a) the responses to future failures can be improved through applying a systematic classification that can be used to simplify the choice of response and (b) future failures can be reduced through augmenting development lifecycles with targeted risk assessments.

Significance. If the proposed classification schema can be shown to generalize, it would offer a structured framework for AI risk analysis that builds systematically on historical cases, potentially aiding standardization in safety practices. The manuscript's strength is its literature-based construction of categories from documented incidents, providing a clear starting point for further work even without validation data.

major comments (2)

[Abstract] The central claims that the classification 'can be used to simplify the choice of response' and will 'reduce future failures' through targeted risk assessments are asserted without any validation data, controlled comparison, case study application, or quantitative assessment of utility.
[Classification schema presentation] The construction of the schema (detailed in the section presenting the classification) relies exclusively on pre-2019 historical cases and supplies no mechanism or test for handling distribution shift, emergent behaviors in large-scale models, or new deployment contexts, which directly undermines the generalization required for claim (b).

minor comments (2)

[Throughout] Notation for category definitions could be made more consistent across the text to aid readability.
[Introduction] Additional references to related taxonomies in AI safety literature would strengthen the positioning of the proposal.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate planned revisions to the manuscript.

Referee: [Abstract] The central claims that the classification 'can be used to simplify the choice of response' and will 'reduce future failures' through targeted risk assessments are asserted without any validation data, controlled comparison, case study application, or quantitative assessment of utility.

Authors: We agree that the abstract presents these benefits as direct outcomes without supporting validation. The manuscript is exploratory and derives the schema from historical cases to suggest logical applications rather than demonstrate them empirically. We will revise the abstract to replace assertive phrasing with tentative language (e.g., 'we propose that the classification may help' and 'could support targeted risk assessments') and add a clause noting that empirical evaluation of utility remains future work.
Revision: yes

Referee: [Classification schema presentation] The construction of the schema (detailed in the section presenting the classification) relies exclusively on pre-2019 historical cases and supplies no mechanism or test for handling distribution shift, emergent behaviors in large-scale models, or new deployment contexts, which directly undermines the generalization required for claim (b).

Authors: The schema is constructed from pre-2019 cases because the paper's scope is a literature-based analysis of documented incidents to derive categories. No explicit mechanism for distribution shift is supplied, as the work focuses on establishing an initial taxonomy rather than a dynamic updating procedure. The categories are defined at a level of abstraction (root cause and impact types) intended to remain applicable across contexts, but we acknowledge this does not constitute a test for emergent behaviors. We will add a dedicated limitations paragraph in the discussion section addressing generalization and suggesting extension protocols for future cases.
Revision: partial

Circularity Check

0 steps flagged

No circularity: classification derived from external historical cases via literature review

The paper constructs its classification schema by examining historical AI failures drawn from external sources and applying logical categorization. No equations, fitted parameters, self-citations that bear the central claim, or derivations appear. The proposal does not reduce any result to inputs defined by the authors' prior work; it is a descriptive taxonomy whose utility for future cases is presented as an open empirical question rather than a self-contained derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The proposal rests on the domain assumption that historical failures supply transferable categories and that classification itself improves response choice and risk reduction; no free parameters or invented physical entities are introduced.

axioms (2)

domain assumption: Historical AI failures can be grouped into categories that generalize to future failures.
Invoked in the abstract when the authors state that examining historical failures will improve responses to future ones.
domain assumption: A systematic classification simplifies choice of response and enables targeted risk assessments.
Central to the hoped-for benefits listed in the abstract.

pith-pipeline@v0.9.0 · 5578 in / 1196 out tokens · 17593 ms · 2026-05-24T21:08:18.734507+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

We address each of these steps in proposing the following dimensions as useful classification criteria for AI failures: Consequences (phenomenology), Agency (etiology), Preventability (ontology), Stage of introduction in the product lifecycle
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

Neumann (Neumann, Computer-Related Risks, 1994) described a classification for computer risk factors... We find this list too broad... We modify and extend earlier work by Yampolskiy (2016)
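The first passage quoted above names four dimensions: Consequences, Agency, Preventability, and stage of introduction in the product lifecycle. The paper excerpts collected under the reference graph tag each incident with short codes such as (CIP, AN, PT, LD). A minimal sketch of grouping such a tag by dimension follows. It assumes that a code's first letter identifies its dimension (C, A, P, L); that reading is inferred from the excerpts and is not stated on this page. The names `DIMENSION_BY_PREFIX` and `group_codes` are hypothetical. Only the lifecycle stage names legible in the excerpts are listed; the meanings of the other codes are not given here and are left undecoded.

```python
from collections import defaultdict
from typing import Dict, List

# ASSUMPTION: the first letter of each code selects the dimension.
DIMENSION_BY_PREFIX = {
    "C": "Consequences",
    "A": "Agency",
    "P": "Preventability",
    "L": "Lifecycle stage",
}

# Lifecycle codes legible in the extracted excerpt; the list is
# truncated in the source, so later stages are omitted here.
LIFECYCLE_STAGES = {
    "LC": "Concept",
    "LD": "Design",
    "LE": "Development",
    "LT": "Testing",
    "LO": "Operation",
}


def group_codes(tag: str) -> Dict[str, List[str]]:
    """Split a tag like '(CIP, AN, PT, LD)' and group codes by dimension."""
    # Treat commas as whitespace so a missing comma still splits cleanly.
    codes = tag.strip("() ").replace(",", " ").split()
    grouped: Dict[str, List[str]] = defaultdict(list)
    for code in codes:
        dimension = DIMENSION_BY_PREFIX.get(code[:1], "Unknown")
        grouped[dimension].append(code)
    return dict(grouped)


grouped = group_codes("(CIP, AN, PT, LD)")
stage = LIFECYCLE_STAGES.get(grouped["Lifecycle stage"][0])
```

Keeping the codes as opaque strings avoids guessing at definitions that appear only in the paper's own tables.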

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

[1]

It is estimated to create 133 million new roles by 2022 but to displace 75 million jobs in the same period [6]

and employ 22,000 PhD researchers [2]. It is estimated to create 133 million new roles by 2022 but to displace 75 million jobs in the same period [6]. Projections for the eventual impact of AI on humanity range from utopia (Kurzweil,

work page 2022
[2]

In many respects AI development outpaces the efforts of prognosticators to predict its progress and is inherently unpredictable (Yampolskiy, 2019)

(p.487) to extinction (Bostrom, 2005). In many respects AI development outpaces the efforts of prognosticators to predict its progress and is inherently unpredictable (Yampolskiy, 2019). Yet all AI development is (so far) undertaken by humans, and the field of software development is noteworthy for unreliability of delivering on promises: over two-thirds ...

work page 2005
[3]

Analysis of the approach of confining a superintelligence has concluded this would be difficult (Yampolskiy,

that researchers have taken to meta-analysis of the predictions through correlation against metrics such as coding experience of the predictors [5]. Less contentious is the assertion that the development of AGI will inevitably lead to the development of ASI: artificial superintelligence, an AI many times more intelligent than the smartest human, if only by ...

work page 1993
[4]

if not impossible (Yudkowsky, 2002). Many of the problems presented by a superintelligence resemble exercises in international diplomacy more than computer software challenges; for instance, the value alignment problem (Bostrom,

work page 2002
[5]

peak of inflated expectations

many fewer vendors are willing to identify their products as AI than during the current period of myriad AI technologies clogging the “peak of inflated expectations” in the Gartner Hype Cycle. [6] Failure is defined as “the nonperformance or inability of the system or component to perform its expected function for a specified time under specified environme...

work page 1995
[6]

fake news

described a classification for computer risk factors (see table 1). Problem sources and examples: Requirements definition, omissions, mistakes; System design, flaws; Hardware implementation, wiring, chip flaws; Software implementation, program bugs, compiler bugs; System use and operation, inadvertent mistakes; Wilful system misuse; Hardware, communication, or other equ...

work page 1989
[7]

A superintelligence might be highly resistant to decommissioning

(p.154). 3.2.4 A common taxonomy for computer system errors is the software development lifecycle stage (see table 5); it is often asserted that the cost of fixing an error at each stage is ten times the cost of fixing it in the previous stage (Dawson, Burrell, Rahim, & Brewster, 2010). Lifecycle Stage / Code: Concept LC; Design LD; Development LE; Testing LT; Operation LO; De...

work page 2010
[8]

(CIP, AN, PT, LD). But these and other more fatal accidents with industrial robots going back at least to 1984 when an operator was killed by a 2,500 lb robot that came behind him with no warning (Fuller,

work page 1984
[9]

Flash Crash

(CIP, CCF, AA, PS, LD). AI accidents may result in direct financial loss. The May 2010 “Flash Crash” resulted in the Dow Jones Industrial Average dropping about 9% for 36 minutes and resulted from program trading algorithms being inadequately prepared to deal with large volumes of strategically-placed trades which themselves were computer-mediated malice

work page 2010
[10]

Remediation efforts did not prevent more flash crashes in 2015 [17]. A major concern in the application of AI is privacy

(CIF, CCF, AA, AM, PD, LD, LT). Remediation efforts did not prevent more flash crashes in 2015 [17]. A major concern in the application of AI is privacy. Consumer devices connected to corporate clouds of identity data come under scrutiny, especially when, for instance, an Amazon Alexa node recorded a private conversation and sent it to a random contact

work page 2015
[11]

Subject’s eyes are closed

(CIE, CIF, CYC, AA, PD, LE). Just as human bias often results from inadequate exposure to diversity, AI bias often arises from the same cause. An attempt to use AI to objectively judge an online international beauty contest without human bias failed when only one of 44 winners it chose had dark skin, prompting speculation that this was due to the training...

work page 2017
[12]

social credit

(CYS, AI, PT, LC). While this software is being used to create exactly its intended effect, we label this a failure because it has consequences many western observers would consider to be socially harmful. China has a “social credit” scoring system reminiscent of a Black Mirror episode (Wright, 2016), linked to social media and consumer systems such as Se...

work page 2016
[13]

and the attention paid by children in class [40], with the most attentive being rewarded (CIM, CYC, AI, PT, LC). In the West the dangers are more nascent. Researchers at the University of Pennsylvania demonstrated that textual analysis of an individual’s Facebook posts could predict 21 different medical conditions such as diabetes (Merchant, Asch, Crutchle...

work page 2019
[14]

An AI designed to do X will eventually fail to do X,

[47]. Companies exploit human psychology to get our attention [48], the US military studies how to influence Twitter users [49], and the Pentagon wants to predict protests against the US President via social media surveillance [50]. As Yampolskiy (2016) pointed out, “An AI designed to do X will eventually fail to do X,” codified as the Fundamental Theorem ...

work page 2016
[15]

Call me an ambulance

the parallels with HAL 9000 of 2001: A Space Odyssey were so irresistible as to obscure the real risks of a computer failure in a critical environment. Apple’s Siri’s initial response to the request “Call me an ambulance” was to refer to the user thereafter as “ambulance”

work page 2001
[16]

properly

that they check every category of failure classification, suggesting a path towards unbounded risk. They can exploit misfeatures or bugs in their environment, such as when in the developmental stages of the NERO video game, players’ robots evolved a wiggling motion that allowed them to walk up walls rather than solve the obstacles “properly” by walking a...

work page 2005
[17]

With Folded Hands

(CIP AM, PD, LE). Most shows that explore AI failure develop a theme epitomized by Terminator series: a massive AI becomes self-aware and attempts to destroy humanity. (CIP, CIE, CIF, CCF, CYF, CYS, CYC, AN, AI, PD, LC, LO). Variations include Colossus: The Forbin Project, where the AI imprisons humanity to end conflict (CIM, CIE, CYS, CYC, AN, AI, PD, LC,...

work page 1947
[18]

Differential privacy

and Maas’ application to AI: “At their extreme, unexpected interactions between competing systems, especially in cyberspace, could cause unexpected escalation—a ‘flash war’, analogous to the algorithmic flash crashes observed in the financial sector.” (Maas, 2018)
5. Responses
There are various responses to these failures and risks. Several address privacy....

work page 2018
[19]

The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities

Conclusions
While we have not made recommendations as to how to address AI failures in each category of the dimensions we have presented, we hope that this classification scheme will make the development of remediation approaches easier. The importance of this effort may be extrapolated from Leveson’s observation that “The design of the automated system may...

work page arXiv 1995