When Do Data-Driven Systems Exhibit the Capability to Infer?
Pith reviewed 2026-06-27 09:59 UTC · model grok-4.3
The pith
The capability to infer under the AI Act must be assessed across the entire data processing workflow rather than the model alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework grades inference capability into levels motivated by statistical learning theory. Applied to credit scoring, it indicates that whether a system infers enough to qualify as an AI system under the AI Act depends on the full data processing workflow, and that the degree of human expert involvement during development can significantly alter the assigned level.
What carries the argument
A multi-level grading framework for inference capability that evaluates how much a workflow goes beyond fixed statistical models by incorporating data-driven adaptation or inference steps.
If this is right
- Credit scoring systems can exhibit inference capability even when built from statistical models if the workflow contains data-driven elements.
- Human expert involvement in development or feature engineering can reduce a system's assigned inference level.
- The entire pipeline, including preprocessing and postprocessing, must be examined rather than the model in isolation.
- Further regulatory guidance is needed for systems whose inference level falls in a gray area between the defined thresholds.
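A minimal sketch of the workflow-level reading above, not the paper's implementation. It assumes each stage of a pipeline has already been assigned an integer inference level, that the workflow's level is the maximum over its stages, and that level 3 is the cutoff (the referee report further down attributes a "levels 3+" mapping to §4 of the paper). The stage names, the example levels, and the max-aggregation rule are all illustrative assumptions.

```python
from dataclasses import dataclass

# Assumed cutoff: the referee report cites "levels 3+" as sufficient under the paper's mapping.
THRESHOLD = 3


@dataclass(frozen=True)
class Stage:
    name: str
    level: int  # hypothetical inference level assigned to this stage


def workflow_level(stages):
    """Aggregate stage levels; taking the maximum is an assumption of this sketch."""
    return max(stage.level for stage in stages)


def exhibits_inference(stages, threshold=THRESHOLD):
    """True if the aggregated workflow level reaches the assumed threshold."""
    return workflow_level(stages) >= threshold


# Illustrative levels only, not taken from the paper.
workflow = [
    Stage("data-driven binning", 3),
    Stage("logistic regression scorecard", 2),
    Stage("fixed cutoff rule", 1),
]

# Looking at the model alone ignores the preprocessing and postprocessing stages.
model_only = [s for s in workflow if s.name == "logistic regression scorecard"]

print(exhibits_inference(model_only))  # False: the model in isolation stays below the cutoff
print(exhibits_inference(workflow))    # True: a data-driven preprocessing step lifts the workflow
```

Under these assumptions the same scorecard is classified differently depending on whether the surrounding stages are included, which is the point of assessing the pipeline as a whole.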
Where Pith is reading between the lines
- Developers could structure workflows to keep inference below the Act's threshold and thereby avoid high-risk obligations.
- The same grading approach could be tested on other Annex III domains such as employment screening or insurance pricing.
- Purely statistical systems might perform similar functions to AI systems yet fall outside the Act's scope if the workflow is designed accordingly.
Load-bearing premise
The levels of inference capability defined in the framework align with the legal interpretation of inference under the AI Act and Commission Guidelines.
What would settle it
A concrete EU regulatory decision or court ruling on whether a credit scoring workflow that uses only non-adaptive statistical models without data-driven elements in any stage counts as having inference capability.
Original abstract
The European AI Act is the first comprehensive regulation of artificial intelligence (AI), setting out extensive obligations, particularly for so-called high-risk and general-purpose AI systems. A key distinguishing feature of AI systems under the AI Act is the capability to infer. Since the AI Act does not clearly define what inference is, there is a gray area for certain data-driven systems. A specific example is credit scoring systems, which are listed by Annex III of the AI Act. At the same time, however, these are often implemented using statistical models for which it is unclear whether they have the capability to infer and thus fall under the AI definition of the AI Act at all. Motivated by statistical learning theory, this work develops a framework for grading different levels of the capability to infer. Based on the AI Act and the Commission Guidelines on the definition of an artificial intelligence system, we analyze which levels constitute sufficient capability to infer within the meaning of the AI Act and where further regulatory clarity is needed. We illustrate the framework by creating two realistic credit scoring workflows and show whether and where inference occurs in them. Our analysis illustrates that not only individual models but the entire data processing workflow must be considered. It also shows that the involvement of human experts during development can have significant influence on the capability to infer. Code can be found at https://github.com/fraunhofer-iais/inference-framework-creditscorecards.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a framework, motivated by statistical learning theory, for grading levels of 'capability to infer' in data-driven systems. It analyzes this against the EU AI Act's definition of AI systems, particularly for credit scoring workflows listed in Annex III. Through two realistic credit scoring examples, it concludes that the entire data processing workflow must be considered, not just individual models, and that human expert involvement during development significantly influences whether the system exhibits inference capability. Code is provided for reproducibility.
Significance. If the proposed grading levels accurately capture the legal notion of inference under the AI Act, this work offers a practical tool for assessing regulatory obligations for statistical models in high-risk areas like credit scoring. It emphasizes workflow-level analysis and the role of human experts, which could inform both developers and regulators. The provision of concrete workflows and open code strengthens its utility for the community.
major comments (2)
- [§4] §4 (Mapping to the AI Act and Commission Guidelines): the claim that framework levels 3+ constitute sufficient 'capability to infer' under the AI Act rests on the authors' interpretive reading of the Act and Guidelines; no formal legal derivation, external expert validation, or cross-check against enforcement practice is provided, yet this mapping is load-bearing for the regulatory conclusions about credit scoring workflows.
- [§5.2] §5.2 (Credit Scoring Workflow 2): the analysis concludes that human expert involvement during feature engineering reduces inference capability below the AI Act threshold, but the grading criteria for 'human involvement' are not operationalized with measurable thresholds, making the workflow-level claim difficult to verify or replicate.
minor comments (2)
- [Abstract] The abstract states the motivation and high-level approach but provides no details on framework construction or grading criteria; expanding the abstract to include one sentence on the levels would improve accessibility.
- [Table 1] Table 1 (Framework Levels): the distinction between levels 2 and 3 uses terms from statistical learning theory without explicit cross-reference to the specific theorems or definitions invoked, which could be clarified with a short footnote.
Simulated Author's Rebuttal
Thank you for the opportunity to respond to the referee's comments. We appreciate the detailed feedback and address each major comment point by point below, indicating where revisions will be made to the manuscript.
- Referee: [§4] §4 (Mapping to the AI Act and Commission Guidelines): the claim that framework levels 3+ constitute sufficient 'capability to infer' under the AI Act rests on the authors' interpretive reading of the Act and Guidelines; no formal legal derivation, external expert validation, or cross-check against enforcement practice is provided, yet this mapping is load-bearing for the regulatory conclusions about credit scoring workflows.
  Authors: We concur that the mapping from our framework levels to the AI Act's notion of inference capability is based on our interpretive reading of the Act and the accompanying Commission Guidelines. As the manuscript is primarily a technical contribution motivated by statistical learning theory, we did not include a formal legal derivation or seek external legal validation. We will revise the text in §4 to more clearly articulate that this mapping represents our technical interpretation informed by the regulatory documents, and to note the need for further regulatory clarity as already mentioned in the paper. We cannot, however, provide a formal legal analysis or cross-check against enforcement practice within the scope of this work.
  Revision: partial
- Referee: [§5.2] §5.2 (Credit Scoring Workflow 2): the analysis concludes that human expert involvement during feature engineering reduces inference capability below the AI Act threshold, but the grading criteria for 'human involvement' are not operationalized with measurable thresholds, making the workflow-level claim difficult to verify or replicate.
  Authors: We agree that the criteria for assessing human involvement lack measurable thresholds, which limits the replicability of the analysis in §5.2. In the revised manuscript, we will operationalize these criteria by providing specific examples and qualitative thresholds for what constitutes 'significant' human involvement in feature engineering, drawing from the credit scoring workflow examples. This will include clearer distinctions between levels of involvement and how they affect the inference grading.
  Revision: yes
- Lack of formal legal derivation, external expert validation, or cross-check against enforcement practice for the mapping in §4
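The distinction at stake in the §5.2 exchange above, between feature engineering fixed by a human expert and feature engineering derived from data, can be illustrated with a small sketch. This is a hypothetical example, not the paper's code: the bin edges, the toy ages, and the function names are assumptions, and how the framework would grade either variant is not specified here.

```python
import statistics


def expert_bins():
    """Bin edges fixed by a human expert before looking at the data (hypothetical values)."""
    return [25.0, 40.0, 60.0]


def data_driven_bins(values, n_bins=4):
    """Bin edges estimated from the sample as quantile cut points."""
    return statistics.quantiles(values, n=n_bins)


def assign_bin(value, edges):
    """Index of the first bin whose upper edge is >= value; larger values go to the last bin."""
    for index, edge in enumerate(edges):
        if value <= edge:
            return index
    return len(edges)


ages = [19, 23, 31, 35, 42, 47, 58, 66]  # toy applicant ages, not real data

fixed = expert_bins()
learned = data_driven_bins(ages)

# The expert edges stay the same when the sample changes; the learned edges move with it.
shifted = data_driven_bins([age + 10 for age in ages])

print(fixed)
print(learned)
print(shifted)
```

The only difference between the two variants is where the edges come from. In the expert variant nothing is estimated from the sample at this step, whereas in the data-driven variant the same step adapts to whatever data it is given.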
Circularity Check
No circularity: framework and mapping rest on external legal texts and statistical learning theory
full rationale
The paper defines inference-capability levels from statistical learning theory and performs an interpretive mapping to the AI Act and the Commission Guidelines on the definition of an AI system. Both sources are external to the paper. The central claims, that the full workflow must be assessed and that human involvement matters, are illustrated by constructing two credit-scoring examples; these examples do not reduce the framework levels or the legal mapping to fitted parameters or self-referential definitions. No equations, self-citations, or ansatzes are shown to create a definitional loop. The derivation chain therefore depends only on the cited external sources, not on the paper's own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The AI Act distinguishes AI systems by the capability to infer, as elaborated in the Commission Guidelines.