GOTabPFN combines GO-LR ordering (equivalent to weighted minimum linear arrangement) and NSC compression to enable practical TabPFN-style prediction on HDLSS tabular data under tight token budgets, improving stability and accuracy.
ZAYAN: Disentangled Contrastive Transformer for Tabular Remote Sensing Data
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
Learning informative representations from tabular data in remote sensing and environmental science is challenging due to heterogeneity, scarce labels, and redundancy among features. We present ZAYAN (Zero-Anchor dYnamic feAture eNcoding), a self-supervised, feature-centric contrastive framework for tabular data. ZAYAN performs contrastive learning at the feature rather than sample level, removing the need for explicit anchor selection and any reliance on class labels, while encouraging a redundancy-minimized, disentangled embedding space. The framework has two modules: ZAYAN-CL, which pretrains feature embeddings via a zero-anchor contrastive objective with dynamic perturbations and masking, and ZAYAN-T, a Transformer that conditions on these embeddings for downstream classification. Across eight datasets, including six remote-sensing tabular benchmarks and two remote-sensing-driven flood-prediction tables from satellite and GIS products, ZAYAN achieves superior accuracy, robustness, and generalization over tabular deep learning baselines, with consistent gains under label scarcity and distribution shift. These results indicate that feature-level contrastive learning and dynamic feature encoding provide an effective recipe for learning from tabular sensing data.
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
GOTabPFN: From Feature Ordering to Compact Tokenization for Tabular Foundation Models on High-Dimensional Data
GOTabPFN combines GO-LR ordering (equivalent to weighted minimum linear arrangement) and NSC compression to enable practical TabPFN-style prediction on HDLSS tabular data under tight token budgets, improving stability and accuracy.