Foundation Models for Credit Risk Prediction: A Game Changer?

Andreas Goethals; Bart Baesens; Christophe Mues; Cristián Bravo; David Martens; Maria Oskarsdóttir; Seppe vanden Broucke; Simon De Vos; Stefan Lessmann; Tim Verdonck; Victor Medina-Olivares; Wouter Verbeke

arxiv: 2605.18147 · v1 · pith:CTSB7GATnew · submitted 2026-05-18 · 💻 cs.LG

Pith reviewed 2026-05-20 13:14 UTC · model grok-4.3

classification 💻 cs.LG

keywords: tabular foundation models · credit risk · probability of default · loss given default · machine learning · benchmarking · small data · pretraining

The pith

Tabular foundation models outperform standard machine learning techniques in credit risk prediction and deliver larger gains as the amount of training data shrinks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests recently introduced tabular foundation models on two central credit risk tasks: estimating the probability that a borrower defaults and the loss that would occur if default happens. These pretrained models are compared against gradient boosting, other advanced machine learning methods, and simpler baselines across multiple real-world datasets and different dataset sizes. The foundation models rank first overall and widen their lead precisely when data becomes scarce, conditions that match many practical lending portfolios. All comparisons use the models exactly as released, with no hyperparameter search or extra tuning. The pattern suggests that broad pretraining on unrelated tabular data can transfer useful structure to credit problems that suffer from limited labels and class imbalance.

Core claim

Tabular foundation models pretrained on large collections of out-of-domain tabular data achieve the highest predictive accuracy for both probability of default and loss given default across the tested datasets and tasks. Their advantage over gradient boosting and other competitors grows markedly as the number of training examples decreases, providing a direct response to the data scarcity, low default rates, and imbalance that have long complicated credit modeling.
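One way to probe the claim that the advantage grows as data shrinks is a rank correlation between a method's performance rank and dataset size, the quantity shown in Figures 8 and 9. The sketch below is illustrative only: the helper names `average_ranks` and `spearman` and the toy numbers are assumptions, not code or values from the paper.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example: dataset sizes and one method's rank (1 = best) on each.
dataset_sizes = [500, 2000, 10000, 50000]
method_rank = [1, 1, 2, 3]
print(f"Spearman rho = {spearman(dataset_sizes, method_rank):.3f}")
```

A positive coefficient here means the method's rank gets worse as datasets grow, which is the pattern one would expect if its relative advantage is concentrated in small-data settings.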

What carries the argument

Tabular foundation models that carry knowledge acquired during pretraining on diverse, non-credit tabular datasets into the target tasks of default probability and loss estimation.

Load-bearing premise

Pretraining on data drawn from domains unrelated to lending still supplies useful patterns that improve performance on the structured and often imbalanced tables found in credit risk.

What would settle it

On a new set of small SME or corporate lending datasets, a carefully tuned gradient boosting model would match or exceed the foundation models on standard metrics such as AUC and Brier score.
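For readers less familiar with the two metrics named above, the following is a minimal sketch of how AUC and the Brier score are computed for predicted default probabilities. The function names and the toy predictions are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    Tied scores count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one default and one non-default")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical predictions from two models on six loans (1 = default).
y = [0, 0, 1, 0, 1, 0]
model_a = [0.10, 0.20, 0.80, 0.30, 0.60, 0.05]
model_b = [0.10, 0.70, 0.80, 0.30, 0.60, 0.05]

for name, probs in [("model_a", model_a), ("model_b", model_b)]:
    print(f"{name}: AUC = {roc_auc(y, probs):.3f}, Brier = {brier_score(y, probs):.4f}")
```

AUC measures ranking quality only, while the Brier score also penalizes poorly calibrated probabilities, so a comparison that is meant to settle the question would reasonably report both.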

Figures

Figures reproduced from arXiv: 2605.18147 by Andreas Goethals, Bart Baesens, Christophe Mues, Cristián Bravo, David Martens, Maria Oskarsdóttir, Seppe vanden Broucke, Simon De Vos, Stefan Lessmann, Tim Verdonck, Victor Medina-Olivares, Wouter Verbeke.

**Figure 2.** Average performance of classification methods in terms of AUC, across the five folds of all PD datasets. A first observation is that TabICL, one of the foundation models considered here, achieves the best performance overall. Although the observed performance differences are small in absolute terms, it is notable that, without any training or hyperparameter optimization, a TFM outperforms widely…

**Figure 3.** Average…

**Figure 4.** Probability of maximal AUC (PAMA) analysis for PD datasets.

**Figure 5.** Probability of maximal R² (PAMA) analysis for LGD datasets.

**Figure 6.** Win/Loss ratio matrix for the PD benchmark (N = 14 datasets). Cell text gives the W/L count from the row method's perspective; cell color encodes the net win rate (W − L)/N (blue: row method wins more often; red: row method loses more often). An asterisk (*) denotes a statistically significant difference (Holm-corrected Wilcoxon, p ≤ 0.05).

**Figure 7.** Win/Loss ratio matrix for the LGD benchmark (N = 7 datasets). Cell text gives the W/L count from the row method's perspective; cell color encodes the net win rate (W − L)/N (blue: row method wins more often; red: row method loses more often). An asterisk (*) denotes a statistically significant difference (Holm-corrected Wilcoxon, p ≤ 0.05).

**Figure 8.** Spearman rank correlation between a method's performance rank and PD dataset size (number of observations…

**Figure 9.** Spearman rank correlation between a method's performance rank and LGD dataset size (number of observations…

**Figure 10.** PD learning curves: average AUC across PD datasets with more than 15,000 observations as the number…

**Figure 11.** LGD learning curves: average R² across LGD datasets with more than 15,000 observations as the number of randomly sampled training observations increases from 500 to 15,000. Curves are shown for TabPFNv2, Linear Regression, and XGBoost.
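The win/loss matrices in Figures 6 and 7 combine three ingredients: per-pair win and loss counts, the net win rate (W − L)/N, and a Holm-corrected Wilcoxon signed-rank test. The sketch below shows one way these pieces could fit together using only the standard library. It is not the authors' code: the function names, the exact sign-flip formulation of the Wilcoxon test, and the toy scores are all assumptions.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def win_loss(row: Sequence[float], col: Sequence[float]) -> Tuple[int, int, float]:
    """Wins, losses, and net win rate (W - L) / N from the row method's view."""
    wins = sum(r > c for r, c in zip(row, col))
    losses = sum(r < c for r, c in zip(row, col))
    return wins, losses, (wins - losses) / len(row)


def wilcoxon_exact_p(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sided exact Wilcoxon signed-rank p-value (zero differences dropped).

    Ranks are doubled so that tie-averaged ranks stay integers; the null
    distribution is obtained by counting sign assignments with dynamic programming.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = (i + 1) + (j + 1)  # twice the mean of ranks i+1..j+1
        i = j + 1
    total = sum(ranks2)
    counts = [1] + [0] * total  # counts[s]: sign assignments with positive-rank sum s
    for r in ranks2:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    w_plus = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    w_min = min(w_plus, total - w_plus)
    return min(1.0, 2 * sum(counts[: w_min + 1]) / 2 ** n)


def holm_adjust(pvals: Sequence[float]) -> List[float]:
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    adjusted = [0.0] * m
    running = 0.0
    for rank, idx in enumerate(sorted(range(m), key=lambda i: pvals[i])):
        running = max(running, min(1.0, (m - rank) * pvals[idx]))
        adjusted[idx] = running
    return adjusted


# Hypothetical AUC values in percent on six datasets (not from the paper).
scores = {"TFM": [81, 77, 90, 72, 85, 79],
          "GBM": [80, 75, 87, 68, 80, 73],
          "LogReg": [78, 76, 85, 66, 79, 70]}
pairs = list(combinations(scores, 2))
adjusted = holm_adjust([wilcoxon_exact_p(scores[a], scores[b]) for a, b in pairs])
for (a, b), p_adj in zip(pairs, adjusted):
    w, l, net = win_loss(scores[a], scores[b])
    star = "*" if p_adj <= 0.05 else ""
    print(f"{a} vs {b}: W/L = {w}/{l}, net = {net:+.2f}, Holm p = {p_adj:.3f}{star}")
```

With only six datasets the smallest attainable two-sided p-value is 2/2^6 ≈ 0.031, so after Holm correction over three pairs no comparison in this toy example can reach p ≤ 0.05. The same arithmetic is worth keeping in mind when reading significance markers on small benchmarks.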

Original abstract

Predictive models play a pivotal role in credit risk management, guiding critical decisions through accurate estimation of default probabilities and losses. Extensive research has introduced new modeling techniques, complemented by large-scale benchmarking studies consolidating the state-of-the-art. Today, quasi-standards such as gradient-boosting models paired with SHAP explainers have emerged, yet continuous improvement of risk models remains a top priority. Concurrently, rapid advancements in AI, most notably large language models, have disrupted predictive modeling paradigms. Foundation models, pretrained on extensive datasets from diverse domains, have demonstrated remarkable performance by leveraging prior knowledge. While prevalent in natural language processing and computer vision, foundation models for tabular data have only recently emerged. We conjecture that pretraining on out-of-domain data is particularly beneficial in small-data settings, such as SME lending or specialized corporate portfolios, and may help address longstanding challenges including low default portfolios and class imbalance. This paper benchmarks recently proposed tabular foundation models against a broad set of competitors, including established and advanced machine learning techniques, across two core tasks: PD and LGD modeling. Our evaluation encompasses various datasets, performance indicators, and experimental conditions. We find that tabular foundation models generally perform best across datasets and tasks. Moreover, they offer significant improvement in predictive performance as dataset size shrinks. These results are remarkable given that the models are tested out-of-the-box, without hyperparameter tuning, ensuring ease of use and mitigating computational costs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Tabular foundation models beat baselines on credit PD/LGD tasks with larger gains on small datasets, but the paper needs tighter stats and dataset details to back the claim.

The main point for you is that this paper benchmarks recent tabular foundation models on probability of default and loss given default tasks and reports they generally win, with the edge getting bigger as dataset size drops, all without any hyperparameter tuning. That small-data pattern is the freshest observation here and lines up with the practical pain points in SME lending or low-default portfolios. They run the comparison across several datasets and tasks, which is useful for seeing how these models hold up against gradient boosting and other standard approaches in a real domain. Testing out of the box is a plus because it keeps the results straightforward and avoids claims that only hold after heavy optimization. The write-up also flags class imbalance and data scarcity as ongoing issues that pretraining might help with, which feels grounded in the credit risk setting.

The soft spots are mostly around transparency. The abstract gives no numbers on exact dataset sizes, how many runs they did, error bars, or the statistical tests used to declare wins, so it is hard to judge whether the reported gains are robust or sensitive to baseline choices. If the full paper supplies those controls and reproducible code, the results become more convincing; without them the improvements could partly reflect implementation differences. The conjecture about out-of-domain pretraining helping small samples is plausible but stays in the background rather than being isolated in an ablation.

This work is aimed at credit risk practitioners and tabular ML researchers who want to see foundation models applied to imbalanced financial data. A reader looking for an applied benchmark rather than new theory will find it worth scanning. It deserves a serious referee because the empirical question is relevant and the small-data angle is worth checking, even if the paper will need clearer experimental reporting before publication.

Referee Report

3 major / 3 minor

Summary. The manuscript benchmarks recently proposed tabular foundation models against established and advanced machine learning techniques for two core credit-risk tasks: probability of default (PD) and loss given default (LGD) prediction. It reports that tabular foundation models achieve the highest performance across multiple datasets and tasks and deliver larger gains as training-set size decreases, even when applied out-of-the-box without hyperparameter tuning. The authors conjecture that out-of-domain pretraining is especially helpful in small-data regimes such as SME lending and for handling class imbalance and low-default portfolios.

Significance. If the empirical claims are substantiated with complete experimental details, the work could meaningfully influence credit-risk modeling practice by demonstrating that pretrained tabular models can outperform gradient-boosting baselines in data-scarce settings without additional tuning. The study also supplies a timely, broad comparison that consolidates the current state of tabular foundation models on financial tasks.

major comments (3)

[Section 4] Section 4 (Experimental Setup): the manuscript does not specify the exact datasets employed, their sizes, sources, or the precise train/validation/test splits used for each size-reduction experiment. Without these details it is impossible to reproduce the reported performance curves or to judge whether the claimed advantage in small-data regimes is robust.
[Section 5.1 and Table 3] Section 5.1 and Table 3: no statistical significance tests, confidence intervals, or error bars are reported for the performance differences between foundation models and baselines. The central claim that foundation models “generally perform best” therefore rests on point estimates whose variability cannot be assessed.
[Section 4.3] Section 4.3 (Handling of class imbalance): the paper states that class imbalance is a longstanding challenge yet provides no description of the loss functions, sampling strategies, or evaluation metrics (e.g., AUC-PR versus AUC-ROC) used to mitigate or measure its effect. This omission directly affects interpretation of the LGD and low-default results.
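The first major comment asks how the size-reduction subsamples were drawn. As a point of reference, a stratified draw that keeps the default rate roughly constant at every training size could look like the sketch below. This is an assumption about one reasonable procedure, not a description of what the manuscript does; the function name, the binary 0/1 label convention, and the synthetic labels are all hypothetical.

```python
import random
from typing import List, Sequence


def stratified_subsample(labels: Sequence[int], n_samples: int, seed: int = 0) -> List[int]:
    """Return sorted row indices of a subsample that preserves the default rate.

    Assumes binary labels (1 = default, 0 = non-default). At least one row of
    each class is always kept so that ranking metrics remain defined.
    """
    defaults = [i for i, y in enumerate(labels) if y == 1]
    goods = [i for i, y in enumerate(labels) if y == 0]
    if not defaults or not goods:
        raise ValueError("need at least one default and one non-default")
    if not 2 <= n_samples <= len(labels):
        raise ValueError("n_samples must lie between 2 and the dataset size")
    rate = len(defaults) / len(labels)
    n_def = max(1, round(n_samples * rate))           # proportional allocation
    n_def = min(n_def, len(defaults), n_samples - 1)  # leave room for a non-default
    n_good = n_samples - n_def
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    return sorted(rng.sample(defaults, n_def) + rng.sample(goods, n_good))


# Synthetic labels: 20,000 loans with a 5% default rate (illustrative only).
labels = [1 if i % 20 == 0 else 0 for i in range(20000)]
for size in [500, 1000, 5000, 15000]:
    idx = stratified_subsample(labels, size, seed=42)
    rate = sum(labels[i] for i in idx) / len(idx)
    print(f"n = {size:>6}: default rate = {rate:.3f}")
```

Reporting the seed, the sizes, and whether the draw was stratified is what would let a reader reproduce a learning curve of the kind requested.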

minor comments (3)

[Abstract and Section 4] The abstract claims results are “remarkable given that the models are tested out-of-the-box,” but the manuscript never states whether the competing gradient-boosting and neural-network baselines were also run without tuning or with default hyperparameters; this comparison should be clarified.
[Figure 2] Figure 2 (performance vs. dataset size) uses different y-axis scales across panels; consistent scaling or explicit annotation of the metric would improve readability.
[Section 2] A few citations to recent tabular foundation-model papers (e.g., TabPFN, TabTransformer variants) appear to be missing from the related-work section.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which highlights important aspects of reproducibility and statistical rigor. We address each major comment point by point below and outline the planned revisions.

Referee: [Section 4] Section 4 (Experimental Setup): the manuscript does not specify the exact datasets employed, their sizes, sources, or the precise train/validation/test splits used for each size-reduction experiment. Without these details it is impossible to reproduce the reported performance curves or to judge whether the claimed advantage in small-data regimes is robust.

Authors: We agree that the current description lacks sufficient detail for full reproducibility. In the revised manuscript we will add a dedicated subsection or appendix table that lists every dataset by name, source (public repositories or anonymized financial sources), original sample size, feature count, and the exact train/validation/test split ratios. For the size-reduction experiments we will explicitly describe the subsampling procedure, including whether stratified sampling was used and the precise percentages or absolute sizes retained at each reduction level.

revision: yes

Referee: [Section 5.1 and Table 3] Section 5.1 and Table 3: no statistical significance tests, confidence intervals, or error bars are reported for the performance differences between foundation models and baselines. The central claim that foundation models “generally perform best” therefore rests on point estimates whose variability cannot be assessed.

Authors: The referee is correct that variability measures would strengthen the claims. Although the experiments used fixed random seeds for reproducibility, we did not report results across multiple independent runs. In the revision we will recompute the main results in Table 3 and Section 5.1 over at least five random seeds, add error bars or standard deviations, and include paired statistical tests (e.g., Wilcoxon signed-rank) for the key comparisons between foundation models and the strongest baselines.

revision: yes

Referee: [Section 4.3] Section 4.3 (Handling of class imbalance): the paper states that class imbalance is a longstanding challenge yet provides no description of the loss functions, sampling strategies, or evaluation metrics (e.g., AUC-PR versus AUC-ROC) used to mitigate or measure its effect. This omission directly affects interpretation of the LGD and low-default results.

Authors: We accept that the handling of class imbalance requires explicit description. In the revised Section 4.3 we will specify that for PD tasks we used class-weighted binary cross-entropy loss together with AUC-PR as the primary evaluation metric, while for LGD (a regression task) we report RMSE and MAE without additional sampling. We will also note any oversampling or threshold-tuning steps applied to low-default portfolios.

revision: yes
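To make the two quantities named in this response concrete, here is a minimal sketch of a class-weighted binary cross-entropy and of average precision, a common estimator of AUC-PR. The "balanced" weighting scheme, the function names, and the toy data are assumptions for illustration; the simulated rebuttal does not say which weights or which AUC-PR estimator would be used.

```python
import math
from typing import Sequence


def weighted_log_loss(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-12) -> float:
    """Binary cross-entropy with 'balanced' class weights n / (2 * n_class).

    Each class contributes half of the total weight, so errors on rare
    defaults are not swamped by the many non-defaults.
    """
    n = len(y_true)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    w_pos, w_neg = n / (2 * n_pos), n / (2 * n_neg)
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        w = w_pos if y == 1 else w_neg
        total -= w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean of precision@k over the ranks k at which a default is found.

    Assumes distinct scores; tied scores are processed in sorted order.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    hits, ap = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / k
    if hits == 0:
        raise ValueError("average precision needs at least one default")
    return ap / hits


# Hypothetical low-default example: 1 default among 4 loans, all scored 0.1.
print(f"weighted log loss = {weighted_log_loss([1, 0, 0, 0], [0.1, 0.1, 0.1, 0.1]):.3f}")
print(f"average precision = {average_precision([0, 0, 1, 0, 1, 0], [0.10, 0.70, 0.80, 0.30, 0.60, 0.05]):.3f}")
```

Unlike AUC-ROC, average precision depends on the default rate of the evaluation sample, so comparisons across datasets with different default rates need to be read with that in mind.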

Circularity Check

0 steps flagged

No significant circularity

This is an empirical benchmarking study that reports performance of tabular foundation models versus baselines on PD and LGD tasks across datasets. The strongest claims rest on observed metrics from out-of-the-box evaluation rather than any derivation, fitted parameter renamed as prediction, or self-referential equation. No mathematical chain, ansatz, or uniqueness theorem is invoked; the conjecture about out-of-domain pretraining is presented as background motivation, not a load-bearing premise that reduces the reported results to inputs by construction. The paper is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on the domain assumption that out-of-domain pretraining transfers usefully to tabular credit data in low-sample regimes; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption: Pretraining on out-of-domain data is particularly beneficial in small-data settings such as SME lending or specialized corporate portfolios.
Explicitly stated as the central conjecture motivating the benchmark.

pith-pipeline@v0.9.0 · 5836 in / 1212 out tokens · 46054 ms · 2026-05-20T13:14:00.661732+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: We find that tabular foundation models generally perform best across datasets and tasks. Moreover, they offer significant improvement in predictive performance as dataset size shrinks.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: TabPFN... amortizes Bayesian inference... zero-shot predictions

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

165 extracted references · 165 canonical work pages · 4 internal anchors

[1]

Foundation models: A new paradigm for artificial intelligence

Johannes Schneider, Christian Meske, and Pauline Kuss. Foundation models: A new paradigm for artificial intelligence. Business & Information Systems Engineering, 66 0 (2): 0 221--231, 2024

work page 2024
[3]

Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748--8763. PmLR, 2021

work page 2021
[4]

Foundation models defining a new era in vision: a survey and outlook

Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, and Fahad Shahbaz Khan. Foundation models defining a new era in vision: a survey and outlook. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

work page 2025
[5]

Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research

Stefan Lessmann, Bart Baesens, Hsin-Vonn Seow, and Lyn C Thomas. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. European journal of operational research, 247 0 (1): 0 124--136, 2015

work page 2015
[6]

Deep learning for credit scoring: Do or don’t? European Journal of Operational Research, 295 0 (1): 0 292--305, 2021

Bj \"o rn Rafn Gunnarsson, Seppe Vanden Broucke, Bart Baesens, Mar \' a \'O skarsd \'o ttir, and Wilfried Lemahieu. Deep learning for credit scoring: Do or don’t? European Journal of Operational Research, 295 0 (1): 0 292--305, 2021

work page 2021
[7]

Tabular data: Deep learning is not all you need

Ravid Shwartz-Ziv and Amitai Armon. Tabular data: Deep learning is not all you need. Information Fusion, 81: 0 84--90, 2022 a

work page 2022
[9]

u ller, Lennart Purucker, Arjun Krishnakumar, Max K \

Noah Hollmann, Samuel M \"u ller, Lennart Purucker, Arjun Krishnakumar, Max K \"o rfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tabular foundation model. Nature, 637 0 (8045): 0 319--326, 2025 a

work page 2025
[10]

The value of big data for credit scoring: Enhancing financial inclusion using mobile phone data and social network analytics

Mar \' a \'O skarsd \'o ttir, Cristi \'a n Bravo, Carlos Sarraute, Jan Vanthienen, and Bart Baesens. The value of big data for credit scoring: Enhancing financial inclusion using mobile phone data and social network analytics. Applied Soft Computing, 74: 0 26--39, 2019

work page 2019
[11]

Fairness in credit scoring: Assessment, implementation and profit implications

Nikita Kozodoi, Johannes Jacob, and Stefan Lessmann. Fairness in credit scoring: Assessment, implementation and profit implications. European Journal of Operational Research, 297 0 (3): 0 1083--1094, 2022

work page 2022
[12]

Algorithmic decision making methods for fair credit scoring

Darie Moldovan. Algorithmic decision making methods for fair credit scoring. IEEE Access, 11: 0 59729--59743, 2023

work page 2023
[13]

Credit risk analytics: Measurement techniques, applications, and examples in SAS

Bart Baesens, Daniel Roesch, and Harald Scheule. Credit risk analytics: Measurement techniques, applications, and examples in SAS. John Wiley & Sons, 2016

work page 2016
[14]

Benchmarking state-of-the-art classification algorithms for credit scoring

Bart Baesens, Tony Van Gestel, Stijn Viaene, Maria Stepanova, Johan Suykens, and Jan Vanthienen. Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the operational research society, 54 0 (6): 0 627--635, 2003

work page 2003
[15]

Benchmarking regression algorithms for loss given default modeling

Gert Loterman, Iain Brown, David Martens, Christophe Mues, and Bart Baesens. Benchmarking regression algorithms for loss given default modeling. International Journal of Forecasting, 28 0 (1): 0 161--170, 2012

work page 2012
[17]

An experimental comparison of classification algorithms for imbalanced credit scoring data sets

Iain Brown and Christophe Mues. An experimental comparison of classification algorithms for imbalanced credit scoring data sets. Expert systems with applications, 39 0 (3): 0 3446--3453, 2012

work page 2012
[18]

On the suitability of resampling techniques for the class imbalance problem in credit scoring

Ana Isabel Marqu \'e s, Vicente Garc \' a, and Jos \'e Salvador S \'a nchez. On the suitability of resampling techniques for the class imbalance problem in credit scoring. Journal of the Operational Research Society, 64 0 (7): 0 1060--1070, 2013

work page 2013
[21]

Reject inference, augmentation, and sample selection

John Banasik and Jonathan Crook. Reject inference, augmentation, and sample selection. European Journal of Operational Research, 183 0 (3): 0 1582--1594, 2007

work page 2007
[22]

Fighting sampling bias: A framework for training and evaluating credit scoring models

Nikita Kozodoi, Stefan Lessmann, Morteza Alamgir, Luis Moreira-Matias, and Konstantinos Papakonstantinou. Fighting sampling bias: A framework for training and evaluating credit scoring models. European Journal of Operational Research, 324 0 (2): 0 616--628, 2025

work page 2025
[23]

Loss given default models incorporating macroeconomic variables for credit cards

Tony Bellotti and Jonathan Crook. Loss given default models incorporating macroeconomic variables for credit cards. International Journal of Forecasting, 28 0 (1): 0 171--182, 2012

work page 2012
[24]

The devil in the details: Dynamic prediction of loan portfolio profitability with macroeconomic drivers through multi-state modelling

Viani B Djeundje, Jonathan Crook, and Galina Andreeva. The devil in the details: Dynamic prediction of loan portfolio profitability with macroeconomic drivers through multi-state modelling. European Journal of Operational Research, 2025

work page 2025
[27]

A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers

Lyn C Thomas. A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers. International journal of forecasting, 16 0 (2): 0 149--172, 2000

work page 2000
[28]

Linear and nonlinear credit scoring by combining logistic regression and support vector machines

Tony Van Gestel, Bart Baesens, Peter Van Dijcke, Johan Suykens, Joao Garcia, and Thomas Alderweireld. Linear and nonlinear credit scoring by combining logistic regression and support vector machines. Journal of credit Risk, 1 0 (4), 2005

work page 2005
[33]

P2p network lending, loss given default and credit risks

Guangyou Zhou, Yijia Zhang, and Sumei Luo. P2p network lending, loss given default and credit risks. Sustainability, 10 0 (4): 0 1010, 2018. ISSN 2071-1050

work page 2018
[46]

Credit scoring for profitability objectives

Steven Finlay. Credit scoring for profitability objectives. European Journal of Operational Research, 202 0 (2): 0 528–537, 2010

work page 2010
[66]

Shapley values as an interpretability technique in credit scoring

Hendrik Andries du Toit, Willem Dani \ A , Helgard Raubenheimer, et al. Shapley values as an interpretability technique in credit scoring. Journal of Risk Model Validation, 2023

work page 2023
[76]

The fairness of credit scoring models

Christophe Hurlin, Christophe Pérignon, and Sébastien Saurin. The fairness of credit scoring models. Management Science, 2025. doi:10.1287/mnsc.2022.03888

work page doi:10.1287/mnsc.2022.03888 2025
[80]

Why do tree-based models still outperform deep learning on typical tabular data? In Advances in Neural Information Processing Systems, 2022

Léo Grinsztajn, Edouard Oyallon, and Gaël Varoquaux. Why do tree-based models still outperform deep learning on typical tabular data? In Advances in Neural Information Processing Systems, 2022

work page 2022
[82]

Deep Learning in Banking: Integrating Artificial Intelligence for Next-Generation Financial Services

Cristian Bravo, Sebastian Maldonado, and Maria Oskarsdottir. Deep Learning in Banking: Integrating Artificial Intelligence for Next-Generation Financial Services. John Wiley & Sons, 2026

work page 2026
[84]

Revisiting deep learning models for tabular data

Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, and Artem Babenko. Revisiting deep learning models for tabular data. Advances in neural information processing systems, 34: 0 18932--18943, 2021

work page 2021
[89]

VIME: extending the success of self- and semi-supervised learning to tabular domain

Jinsung Yoon, Yao Zhang, James Jordon, and Mihaela van der Schaar. VIME: extending the success of self- and semi-supervised learning to tabular domain. In Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria - Florina Balcan, and Hsuan - Tien Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information ...

work page 2020
[90]

Subtab: Subsetting features of tabular data for self-supervised representation learning

Talip Ucar, Ehsan Hajiramezanali, and Lindsay Edwards. Subtab: Subsetting features of tabular data for self-supervised representation learning. In Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Proces...

work page 2021
[92]

Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David A. Sontag. Tabllm: Few-shot classification of tabular data with large language models. In Francisco J. R. Ruiz, Jennifer G. Dy, and Jan - Willem van de Meent, editors, International Conference on Artificial Intelligence and Statistics, 25-27 April 2023, Palau de Con...

work page 2023
[93]

Tabpfn: A transformer that solves small tabular classification problems in a second

Noah Hollmann, Samuel M \" u ller, Katharina Eggensperger, and Frank Hutter. Tabpfn: A transformer that solves small tabular classification problems in a second. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023 . OpenReview.net, 2023. URL https://openreview.net/forum?id=cp5PvcI6w8\_

work page 2023
[94]

Transformers can do bayesian inference

Samuel M \" u ller, Noah Hollmann, Sebastian Pineda - Arango, Josif Grabocka, and Frank Hutter. Transformers can do bayesian inference. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022 . OpenReview.net, 2022. URL https://openreview.net/forum?id=KSugKcbNf9

work page 2022
[97]

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Benjamin Jäger, Dominik Safaric, Simone Alessi, Adrian Hayler, Mihir Manium, Rosen Yu, Felix Jablonski, Shi Bin Hoo, Anurag Garg, Jake Robertson, Magnus Bühler, Vladyslav Moroshan, Lennart Purucker, Clara Cornu, Lilly Charlotte Wehrhahn, Alessandro Bonetto, Bernhard Schölk...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2511.08667 2026
[99]

Thomas, David B

Lyn C. Thomas, David B. Edelman, and Jonathan N. Crook. Credit Scoring and its Applications. Siam, Philadelphia, 2002

work page 2002
[102]

Statistical comparisons of classifiers over multiple data sets

Janez Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7: 0 1–30, 2006

work page 2006
[103]

Individual comparisons by ranking methods

Frank Wilcoxon. Individual comparisons by ranking methods. Biometrics bulletin, 1 0 (6): 0 80--83, 1945

work page 1945
[104]

Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15 0 (90): 0 3133--3181, 2014

Manuel Fern \'a ndez-Delgado, Eva Cernadas, Sen \'e n Barro, and Dinani Amorim. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15 0 (90): 0 3133--3181, 2014. URL http://jmlr.org/papers/v15/delgado14a.html

work page 2014
[105]

A comparison of alternative tests of significance for the problem of m rankings

Milton Friedman. A comparison of alternative tests of significance for the problem of m rankings. The annals of mathematical statistics, 11 0 (1): 0 86--92, 1940

work page 1940
[106]

Approximations of the critical region of the fbietkan statistic

Ronald L Iman and James M Davenport. Approximations of the critical region of the fbietkan statistic. Communications in Statistics-Theory and Methods, 9 0 (6): 0 571--595, 1980

work page 1980
[107]

A simple sequentially rejective multiple test procedure

Sture Holm. A simple sequentially rejective multiple test procedure. Scandinavian journal of statistics, pages 65--70, 1979

work page 1979
[108]

An extension on "Statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons

Salvador Garcia and Francisco Herrera. An extension on "Statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. Journal of Machine Learning Research, 9(12), 2008
[109]

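Deep Learning in Banking: Integrating Artificial Intelligence for Next-Generation Financial Services

Cristian Bravo, Sebastian Maldonado, and Maria Oskarsdottir. Deep Learning in Banking: Integrating Artificial Intelligence for Next-Generation Financial Services. John Wiley & Sons, 2026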
[110]

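The value of big data for credit scoring: Enhancing financial inclusion using mobile phone data and social network analytics

María Óskarsdóttir, Cristián Bravo, Carlos Sarraute, Jan Vanthienen, and Bart Baesens. The value of big data for credit scoring: Enhancing financial inclusion using mobile phone data and social network analytics. Applied Soft Computing, 74: 26–39, 2019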
[111]

Deep learning for credit scoring: Do or don’t?

Björn Rafn Gunnarsson, Seppe vanden Broucke, Bart Baesens, María Óskarsdóttir, and Wilfried Lemahieu. Deep learning for credit scoring: Do or don’t? European Journal of Operational Research, 295(1): 292–305, 2021
[112]

Foundation models defining a new era in vision: a survey and outlook

Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, and Fahad Shahbaz Khan. Foundation models defining a new era in vision: a survey and outlook. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
[113]

Benchmarking state-of-the-art classification algorithms for credit scoring

Bart Baesens, Tony Van Gestel, Stijn Viaene, Maria Stepanova, Johan Suykens, and Jan Vanthienen. Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society, 54(6): 627–635, 2003
[114]

Credit risk analytics: Measurement techniques, applications, and examples in SAS

Bart Baesens, Daniel Roesch, and Harald Scheule. Credit risk analytics: Measurement techniques, applications, and examples in SAS. John Wiley & Sons, 2016
[115]

Do we need hundreds of classifiers to solve real world classification problems?

Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, and Dinani Amorim. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15(90): 3133–3181, 2014
[116]

Statistical comparisons of classifiers over multiple data sets

Janez Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7: 1–30, 2006
[117]

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

Jingang Qu, David Holzmüller, Gaël Varoquaux, and Marine Le Morvan. TabICL: A Tabular Foundation Model for In-Context Learning on Large Data. arXiv preprint. doi:10.48550/arXiv.2502.05564
[118]

Han-Jia Ye, Si-Yang Liu, and Wei-Lun Chao. arXiv preprint. doi:10.48550/arXiv.2502.17361
[119]

Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: ... Proceedings of the 25th ..., 2019. doi:10.1145/3292500.3330701
[120]

Sahab Zandi, Kamesh Korangi, María Óskarsdóttir, Christophe Mues, and Cristián Bravo. European Journal of Operational Research. doi:10.1016/j.ejor.2024.09.025
[121]

María Óskarsdóttir and Cristián Bravo. Omega. doi:10.1016/j.omega.2021.102520
[122]

Raffaella Calabrese and Jonathan Crook. European Journal of Operational Research. doi:10.1016/j.ejor.2020.04.031
[123]

Victor Medina-Olivares, Finn Lindgren, Raffaella Calabrese, and Jonathan Crook. European Journal of Operational Research. doi:10.1016/j.ejor.2025.07.060
[124]

Yong Shi, Yi Qu, Zhensong Chen, Yunlong Mi, and Yunong Wang. European Journal of Operational Research. doi:10.1016/j.ejor.2023.12.028
[125]

Yibei Li, Ximei Wang, Boualem Djehiche, and Xiaoming Hu. European Journal of Operational Research. doi:10.1016/j.ejor.2020.03.078
[126]

Sofie De Cnudde, Julie Moeyersoms, Marija Stankova, Ellen Tobback, Vinayak Javaly, and David Martens. Journal of the Operational Research Society. doi:10.1080/01605682.2018.1434402
[127]

Viani B. Djeundje, Jonathan Crook, Raffaella Calabrese, and Mona Hamid. Expert Systems with Applications. doi:10.1016/j.eswa.2020.113766
[128]

Lore Dirick, Tony Bellotti, Gerda Claeskens, and Bart Baesens. Journal of Business and Economic Statistics. doi:10.1080/07350015.2016.1260471
[129]

Gary Mena, Kristof Coussement, Koen W. De Bock, Arno De Caigny, and Stefan Lessmann. Annals of Operations Research. doi:10.1007/s10479-023-05259-9
[130]

Revisiting deep learning models for tabular data

Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, and Artem Babenko. Revisiting deep learning models for tabular data. Advances in Neural Information Processing Systems, 34: 18932–18943, 2021
[131]

Jun-Peng Jiang, Si-Yang Liu, Hao-Run Cai, Qile Zhou, and Han-Jia Ye. arXiv preprint. doi:10.48550/arXiv.2504.16109
[132]

Kamesh Korangi, Christophe Mues, and Cristián Bravo. European Journal of Operational Research. doi:10.1016/j.ejor.2022.10.032
[133]

Johannes Kriebel and Lennart Stitz. European Journal of Operational Research. doi:10.1016/j.ejor.2021.12.024
[134]

Matthew Stevenson, Christophe Mues, and Cristián Bravo. European Journal of Operational Research. doi:10.1016/j.ejor.2021.03.008
[135]

Zongxiao Wu, Yizhe Dong, Yaoyiran Li, and Baofeng Shi. European Journal of Operational Research. doi:10.1016/j.ejor.2025.04.032
[136]

Runshan Fu, Yan Huang, and Param Vir Singh. Information Systems Research. doi:10.1287/isre.2020.0990
[137]

Andreas Fuster, Paul Goldsmith-Pinkham, Tarun Ramadorai, and Ansgar Walther. The Journal of Finance. doi:10.1111/jofi.13090
[138]

Christophe Hurlin, Christophe Pérignon, and Sébastien Saurin. Management Science, 2025
[139]

Mathias Kraus, Daniel Tschernutter, Sven Weinzierl, and Patrick Zschech. European Journal of Operational Research. doi:10.1016/j.ejor.2023.06.032
[140]

Emilio Carrizosa, Kseniia Kurishchenko, and Dolores Romero Morales. European Journal of Operational Research. doi:10.1016/j.ejor.2025.01.008
[141]

Emanuele Borgonovo, Elmar Plischke, and Giovanni Rabitti. European Journal of Operational Research. doi:10.1016/j.ejor.2024.06.023
[142]

Lazaros Zografopoulos, Maria Chiara Iannino, Ioannis Psaradellis, and Georgios Sermpinis. European Journal of Operational Research. doi:10.1016/j.ejor.2024.08.032
[143]

Victor Medina-Olivares, Stefan Lessmann, and Nadja Klein. IEEE Transactions on Neural Networks and Learning Systems. doi:10.1109/TNNLS.2024.3398559
[144]

Jiancheng Tu and Zhibin Wu. European Journal of Operational Research. doi:10.1016/j.ejor.2024.10.046
[145]

Arno De Caigny, Kristof Coussement, and Koen W. De Bock. European Journal of Operational Research. doi:10.1016/j.ejor.2018.02.009
[146]

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Nature, 2015. doi:10.1038/nature14539

