Real-TabPFN: Improving tabular foundation models via continued pre-training with real-world data. arXiv preprint arXiv:2507.03971.
9 Pith papers cite this work.
citing papers explorer
- MacrOData: New Benchmarks of Thousands of Datasets for Tabular Outlier Detection
  MacrOData supplies three large, curated benchmark suites totaling 2,446 datasets for tabular outlier detection, complete with standardized splits, metadata, and a public leaderboard.
- TabArena: A Living Benchmark for Machine Learning on Tabular Data
  TabArena launches a dynamic, updatable benchmarking system for tabular ML that shows boosted trees remain competitive, deep learning matches them under larger budgets with ensembling, foundation models excel on small data, and cross-model ensembles advance SOTA while flagging validation overfitting.
- Breaking the Quality-Privacy Tradeoff in Tabular Data Generation via In-Context Learning
  DiffICL breaks the quality-privacy tradeoff in small-data tabular synthesis by using in-context learning on pretrained structural priors to generate data that is both higher quality and less prone to memorizing training samples.
- Tabular foundation models for in-context prediction of molecular properties
  Tabular foundation models achieve high accuracy in molecular property prediction through in-context learning, with up to 100% win rates on MoleculeACE tasks when paired with CheMeleon embeddings.
- KumoRFM-2: Scaling Foundation Models for Relational Learning
  KumoRFM-2 pre-trains on synthetic and real relational data across row, column, foreign-key and cross-sample axes, injects task information early, and achieves up to 8% gains over supervised baselines on 41 benchmarks in few-shot and fine-tuned regimes while handling billion-scale datasets.
- FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data
  FEAT is a linear-complexity structured data foundation model using dual-axis encoding, AFBM state-space models, and Conv-GLA to achieve O(N) scaling and permutation invariance while outperforming prior SFMs on real-world benchmarks.
- TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models
  TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast production deployment.
- When Tabular Foundation Models Meet Strategic Tabular Data: A Prior Alignment Approach
  The paper proposes Strategic Prior-data Fitted Network (SPN), an inference-time method that adapts pretrained tabular foundation models to strategic feature manipulation by constructing aligned in-context examples.
- Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap
  Six modern tabular foundation models are near-redundant, limiting ensemble gains to +0.18% accuracy at high cost while some methods degrade calibration.