Orion-MSP: Multi-scale sparse attention for tabular in-context learning
9 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.LG (9)
years: 2026 (9)
roles: background (2)
polarities: background (2)
representative citing papers
MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.
TFM-Retouche is an architecture-agnostic input-space residual adapter that improves tabular foundation model accuracy on 51 datasets by learning input corrections through the frozen backbone, with an identity guard to fall back to the original model.
O'Prior, a compositional synthetic prior with hierarchical SCMs, realism engines, stress modules, and curriculum protocols, improves tabular foundation model accuracy and robustness on real benchmarks when architecture and compute are held fixed.
Distilling TabICLv2 into XGBoost via stratified OOF labeling yields 0.882 macro-mean AUC (96.5% of teacher) at 1.9 ms CPU across 153 datasets, with significant gains over tuned CatBoost on low-dimensional data.
Leakage-aware distillation transfers at least 90% of tabular foundation model AUC to lightweight students across 19 health datasets, with 26x CPU speedup and preserved calibration/fairness.
Six modern tabular foundation models are near-redundant, limiting ensemble gains to +0.18% accuracy at high cost while some methods degrade calibration.
Context construction strategies such as balanced sampling improve AUC-ROC by 3-4 points over uniform sampling in tabular foundation models for credit risk, exceeding differences between model families and matching classical baselines.
TabPFN maintains high ROC-AUC and structured attention under controlled additions of irrelevant features, nonlinear correlations, and mislabeled targets in binary classification.
citing papers explorer
- Rethinking Weak Supervision in Anomaly Detection: A Comprehensive Benchmark
  WSADBench unifies WSAD evaluation across three supervision types, runs 700K experiments on 36 algorithms and 4 modalities, and finds strong correlations between scenarios plus performance boundaries favoring general models except in extreme label scarcity.
- Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees
  Distilling TabICLv2 into XGBoost via stratified OOF labeling yields 0.882 macro-mean AUC (96.5% of teacher) at 1.9 ms CPU across 153 datasets, with significant gains over tuned CatBoost on low-dimensional data.