Tabular foundation models for in-context prediction of molecular properties

2026 · cs.LG · arXiv 2604.16123

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Accurate molecular property prediction is central to drug discovery, catalysis, and process design, yet real-world applications are often limited by small datasets. Molecular foundation models provide a promising direction by learning transferable molecular representations; however, they typically involve task-specific fine-tuning, require machine learning expertise, and often fail to outperform classical baselines. Tabular foundation models (TFMs) offer a fundamentally different paradigm: they perform predictions through in-context learning, enabling inference without task-specific training. Here, we evaluate TFMs in the low- to medium-data regime across both standardized pharmaceutical benchmarks and chemical engineering datasets. As inputs, we evaluate frozen molecular foundation model representations as well as classical descriptors and fingerprints. Across the benchmarks, the approach achieves excellent predictive performance at lower computational cost than fine-tuning, and these advantages also transfer to practical engineering data settings. In particular, combining TFMs with CheMeleon embeddings yields up to 100% win rates on 30 MoleculeACE tasks, while compact RDKit2d and Mordred descriptors provide strong descriptor-based alternatives. Molecular representation emerges as a key determinant of TFM performance, with molecular foundation model embeddings and 2D descriptor sets both providing substantial gains over classic molecular fingerprints on many tasks. These results suggest that in-context learning with TFMs provides a highly accurate and cost-efficient alternative for property prediction in practical applications.
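The abstract summarizes results as win rates across tasks. The following is a minimal sketch of how such a figure could be computed, assuming a win means strictly lower error than a baseline on the same task; the paper's exact definition, tie handling, and error metric are not stated here. The function name `win_rate`, the task names, and the RMSE values are hypothetical placeholders, not results from the paper.

```python
def win_rate(candidate_errors, baseline_errors):
    """Fraction of shared tasks where the candidate has strictly lower error.

    Assumption: lower is better (e.g. RMSE) and ties count as losses.
    """
    shared = candidate_errors.keys() & baseline_errors.keys()
    if not shared:
        raise ValueError("no tasks in common")
    wins = sum(candidate_errors[task] < baseline_errors[task] for task in shared)
    return wins / len(shared)


# Hypothetical per-task RMSE values, for illustration only.
tfm_rmse = {"task_a": 0.61, "task_b": 0.72, "task_c": 0.55, "task_d": 0.80}
finetuned_rmse = {"task_a": 0.65, "task_b": 0.70, "task_c": 0.58, "task_d": 0.83}

rate = win_rate(tfm_rmse, finetuned_rmse)
print(f"win rate: {rate:.0%}")  # prints "win rate: 75%"
```

Under this definition, a 100% win rate on 30 tasks would mean the candidate beats the baseline on every one of the 30 tasks.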

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

TabPFN-3: Technical Report

cs.LG · 2026-05-13 · unverdicted · novelty 6.0 · 2 refs

TabPFN-3 scales tabular foundation models to 1M rows with synthetic pretraining, test-time compute, and benchmark-leading performance on tabular, relational, and tabular-text tasks while being up to 20x faster than TabPFN-2.5.

When Tabular Foundation Models Transfer Across Modalities: A Systematic Evaluation Across 95 Datasets, 7 Modalities, and Two Regimes

cs.LG · 2026-06-01 · unverdicted · novelty 5.0

A tabular foundation model pipeline with ETF preprocessing transfers across 7 modalities on 95 datasets, matching lightweight tuned baselines on frozen features at much higher speed while providing calibration for deployment.

citing papers explorer

Showing 2 of 2 citing papers.

TabPFN-3: Technical Report cs.LG · 2026-05-13 · unverdicted · none · ref 88 · 2 links · internal anchor
When Tabular Foundation Models Transfer Across Modalities: A Systematic Evaluation Across 95 Datasets, 7 Modalities, and Two Regimes cs.LG · 2026-06-01 · unverdicted · none · ref 3 · internal anchor
