arxiv: 1906.10019 · v4 · submitted 2019-06-24 · 📊 stat.ML · cs.LG

Machine Learning Construction: implications to cybersecurity

Pith reviewed 2026-05-25 16:54 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords machine learning construction · cybersecurity · statistical learning · detection algorithms · assessment · cyberphysical security · threat intelligence

The pith

Decomposing machine learning into construction and assessment advances its applications in cybersecurity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that machine learning decomposes into two subfields: construction, the design of algorithms that learn from data, and assessment, the measurement of their performance. It applies this framework to cyberphysical security, where the goal is detection algorithms that learn from security data to support threat hunting and incident response. A reader would care because the split offers a structured way to develop ML tools that cope with the complexity of modern threats. The paper also notes that both subfields draw on neighboring fields such as probability, statistics, and optimization.

Core claim

The central claim is that machine learning consists of construction—the invention of algorithms that learn input-output relationships from limited observations—and assessment—the attribution of performance measures to those algorithms. This decomposition serves as a useful framework for designing detection algorithms in cyberphysical security that can hunt threats and remediate incidents.
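
The split described above can be made concrete with a minimal sketch. It is not the paper's method: the function names `construct` and `assess`, the single-score event representation, the midpoint threshold rule, and the toy data are all assumptions chosen for illustration. FPF and FNF are read here as the false positive and false negative fractions, following the Figure 3 caption.

```python
from statistics import mean


def construct(scores, labels):
    """Construction: learn a detection rule from labeled training data.

    Hypothetical setup: each event is one numeric score, label 1 marks a
    malicious event and label 0 a benign one. The rule is a threshold at
    the midpoint of the two class means (an illustrative choice only).
    """
    benign = [s for s, y in zip(scores, labels) if y == 0]
    malicious = [s for s, y in zip(scores, labels) if y == 1]
    threshold = (mean(benign) + mean(malicious)) / 2.0

    def detector(score):
        return 1 if score > threshold else 0

    return detector


def assess(detector, scores, labels):
    """Assessment: estimate the two error components on held-out data."""
    pairs = list(zip(scores, labels))
    false_pos = sum(1 for s, y in pairs if y == 0 and detector(s) == 1)
    false_neg = sum(1 for s, y in pairs if y == 1 and detector(s) == 0)
    return {
        "fpf": false_pos / labels.count(0),  # false positive fraction
        "fnf": false_neg / labels.count(1),  # false negative fraction
    }


# Made-up data for illustration; not taken from the paper.
train_scores, train_labels = [0, 1, 2, 8, 9, 10], [0, 0, 0, 1, 1, 1]
test_scores, test_labels = [1, 6, 4, 9], [0, 0, 1, 1]

detector = construct(train_scores, train_labels)
report = assess(detector, test_scores, test_labels)
print(report)
```

Keeping the two steps in separate functions, fed by separate data, mirrors the point of the decomposition: the detector can be redesigned without touching how its errors are estimated, and the reverse.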

What carries the argument

The decomposition of machine learning into construction (designing the learning algorithm) and assessment (measuring performance).

If this is right

  • Design of detection algorithms capable of learning from security data.
  • Better monitoring of security incidents.
  • Mastery of the complexity of threat intelligence feeds.
  • Timely remediation of security incidents.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This split could organize ML development across other domains that rely on data-driven detection.
  • Focusing on construction might encourage creation of new algorithms rather than repeated tuning of old ones.
  • Links to optimization and matrix theory could support more systematic security model building.

Load-bearing premise

The decomposition of machine learning into construction and assessment provides a useful and primary framework for advancing applications in cyberphysical security.

What would settle it

A demonstration that security detection systems perform no better when their development explicitly separates algorithm design from performance evaluation than when they do not.

Figures

Figures reproduced from arXiv: 1906.10019 by Waleed A. Yousef.

Figure 1: Schematic diagram for a single hidden layer neural network.
Figure 2: Sigmoid function under different learning rate
Figure 3: The probability of log-likelihood ratio conditional under each class. The two components of error are indicated as the FPF and FNF, the conven…
Figure 4: ROC curves for two different classifiers. ROC…
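
To make the comparison in Figure 4 concrete, here is a small sketch of an empirical ROC curve and its area (AUC) for two classifiers. It is an illustration under stated assumptions, not code from the paper: the helper names, the convention that higher scores indicate the positive class, and the toy scores are all hypothetical. TPF (true positive fraction, equal to 1 minus FNF) is a term introduced here for the vertical axis.

```python
def roc_points(neg_scores, pos_scores):
    """Empirical ROC points (FPF, TPF), sweeping the threshold from high to low."""
    thresholds = sorted(set(neg_scores) | set(pos_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpf = sum(1 for s in neg_scores if s >= t) / len(neg_scores)
        tpf = sum(1 for s in pos_scores if s >= t) / len(pos_scores)
        points.append((fpf, tpf))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def rank_auc(neg_scores, pos_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up scores from two hypothetical classifiers on the same six events.
neg_a, pos_a = [1, 2, 3], [2, 4, 5]
neg_b, pos_b = [1, 2, 3], [1, 2, 4]

auc_a = trapezoid_area(roc_points(neg_a, pos_a))
auc_b = trapezoid_area(roc_points(neg_b, pos_b))
print(auc_a, auc_b)
```

The trapezoidal area and the pairwise ranking fraction are two routes to the same number, which is why the AUC is often read as the probability that a randomly chosen positive event scores higher than a randomly chosen negative one.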
Original abstract

Statistical learning is the process of estimating an unknown probabilistic input-output relationship of a system using a limited number of observations. A statistical learning machine (SLM) is the algorithm, function, model, or rule, that learns such a process; and machine learning (ML) is the conventional name of this field. ML and its applications are ubiquitous in the modern world. Systems such as Automatic target recognition (ATR) in military applications, computer aided diagnosis (CAD) in medical imaging, DNA microarrays in genomics, optical character recognition (OCR), speech recognition (SR), spam email filtering, stock market prediction, etc., are few examples and applications for ML; diverse fields but one theory. In particular, ML has gained a lot of attention in the field of cyberphysical security, especially in the last decade. It is of great importance to this field to design detection algorithms that have the capability of learning from security data to be able to hunt threats, achieve better monitoring, master the complexity of the threat intelligence feeds, and achieve timely remediation of security incidents. The field of ML can be decomposed into two basic subfields: construction and assessment. We mean by construction designing or inventing an appropriate algorithm that learns from the input data and achieves a good performance according to some optimality criterion. We mean by assessment attributing some performance measures to the constructed ML algorithm, along with their estimators, to objectively assess this algorithm. Construction and assessment of a ML algorithm require familiarity with different other fields: probability, statistics, matrix theory, optimization, algorithms, and programming, among others.f

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript defines statistical learning and decomposes the field of machine learning into two subfields—construction (designing or inventing learning algorithms that achieve good performance according to an optimality criterion) and assessment (attributing performance measures and their estimators to constructed algorithms)—and states that this decomposition is of great importance for designing detection algorithms in cyberphysical security to hunt threats, monitor systems, handle threat intelligence, and remediate incidents. No equations, algorithms, examples, or empirical results are presented.

Significance. The definitional distinction between construction and assessment is standard in the ML literature and does not, on its own, constitute a novel contribution. If the decomposition were shown to yield concrete advances (e.g., a new construction method or assessment protocol tailored to security data), it could help organize research; however, the manuscript provides no such demonstration, leaving the claimed implications to cybersecurity unsupported.

major comments (2)
  1. [Abstract] The claim that the construction-assessment decomposition is 'of great importance' to cyberphysical security is load-bearing for the paper's thesis, yet the manuscript supplies neither a worked cybersecurity example nor a derivation showing how the split produces a measurable improvement over existing ML pipelines for threat detection or remediation.
  2. [Abstract, final paragraph] The assertion that construction and assessment 'require familiarity with' probability, statistics, matrix theory, optimization, algorithms, and programming is presented without any mapping of these fields onto the two subfields or any indication of how the decomposition itself reduces the required expertise or complexity in security applications.
minor comments (1)
  1. [Abstract] The abstract terminates with the fragment 'among others.f', which appears to be a typographical artifact.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract] The claim that the construction-assessment decomposition is 'of great importance' to cyberphysical security is load-bearing for the paper's thesis, yet the manuscript supplies neither a worked cybersecurity example nor a derivation showing how the split produces a measurable improvement over existing ML pipelines for threat detection or remediation.

    Authors: The manuscript is a short conceptual note whose primary aim is to articulate the construction-assessment split and note its relevance to cybersecurity tasks such as threat detection and incident remediation. We agree that the absence of a concrete worked example leaves the claimed implications at a general level and that a derivation of measurable improvement is not supplied. The decomposition is offered as an organizing lens rather than a new algorithm or protocol. We will revise the manuscript to include a brief illustrative scenario showing how separating construction (algorithm design for security data) from assessment (performance estimation) can clarify workflow choices in a detection setting. revision: yes

  2. Referee: [Abstract, final paragraph] The assertion that construction and assessment 'require familiarity with' probability, statistics, matrix theory, optimization, algorithms, and programming is presented without any mapping of these fields onto the two subfields or any indication of how the decomposition itself reduces the required expertise or complexity in security applications.

    Authors: The listed disciplines are the conventional prerequisites for machine learning work. The manuscript does not supply an explicit mapping or a claim that the split quantitatively reduces expertise demands. A natural reading is that construction draws principally on optimization, algorithms, and programming, while assessment draws principally on probability, statistics, and matrix theory; the separation may therefore allow more focused allocation of effort in security projects. We will revise the final paragraph to include a concise mapping of the listed fields to the two subfields. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The manuscript is purely expository and contains no derivations, equations, predictions, theorems, or empirical claims. It defines the terms 'construction' (designing a learning algorithm) and 'assessment' (measuring performance) by fiat in the abstract and introduction, then states their relevance to cybersecurity detection. Because no load-bearing step reduces to a fit, self-citation chain, or input by construction, the circularity score is 0. The decomposition is presented as a useful framework rather than a derived result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review prevents identification of any free parameters, axioms, or invented entities beyond the definitional statements in the text. No numerical fits, unproved lemmas, or new postulated entities are present.

axioms (1)
  • domain assumption: Machine learning decomposes into construction (designing the learning algorithm) and assessment (measuring performance) as defined.
    Explicitly stated in the abstract with the phrases 'We mean by construction...' and 'We mean by assessment...'

pith-pipeline@v0.9.0 · 5829 in / 1129 out tokens · 27760 ms · 2026-05-25T16:54:17.722353+00:00
