TabClustPFN: A Prior-Fitted Network for Tabular Data Clustering
2 Pith papers cite this work. Polarity classification is still indexing.
abstract
Clustering tabular data is a fundamental yet challenging problem due to heterogeneous feature types, diverse data-generating mechanisms, and the absence of transferable inductive biases across datasets. Prior-fitted networks (PFNs) have recently demonstrated strong generalization in supervised tabular learning by amortizing Bayesian inference under a broad synthetic prior. Extending this paradigm to clustering is nontrivial: clustering is unsupervised, admits a combinatorial and permutation-invariant output space, and requires inferring the number of clusters. We introduce TabClustPFN, a prior-fitted network for tabular data clustering that performs amortized Bayesian inference over both cluster assignments and cluster cardinality. Pretrained on synthetic datasets drawn from a flexible clustering prior, TabClustPFN clusters unseen datasets in a single forward pass, without dataset-specific retraining or hyperparameter tuning. The model naturally handles heterogeneous numerical and categorical features and adapts to a wide range of clustering structures. Experiments on synthetic data and curated real-world tabular benchmarks show that TabClustPFN outperforms classical, deep, and amortized clustering baselines, while exhibiting strong robustness in out-of-the-box exploratory settings. Code is available at https://github.com/Tianqi-Zhao/TabClustPFN.
fields: cs.LG (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
LUCoS replaces raw tabular geometry with unsupervised PFN latent embeddings for medoid-based context selection and ranks first on mean AUC, ACC, and F1 across 67 datasets and six budgets.
citing papers explorer
- FlexTab: A Flexible Encoder-Decoder Architecture for In-Context Learning Across Diverse Tabular Tasks
A task-agnostic encoder with task-specific decoders enables in-context learning across classification, regression, anomaly detection, clustering, entity matching, and entity classification on tabular data, achieving SOTA on several tasks.
- LUCoS: Latent Unsupervised Context Selection for Tabular Foundation Models
LUCoS replaces raw tabular geometry with unsupervised PFN latent embeddings for medoid-based context selection and ranks first on mean AUC, ACC, and F1 across 67 datasets and six budgets.