
arXiv preprint arXiv:1708.03731

· 2021 · arXiv 1708.03731

10 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background: 1 · dataset: 1

citation-polarity summary

background: 1 · use dataset: 1

representative citing papers

MacrOData: New Benchmarks of Thousands of Datasets for Tabular Outlier Detection

cs.LG · 2026-02-10 · accept · novelty 8.0

MacrOData supplies three large, curated benchmark suites totaling 2,446 datasets for tabular outlier detection, complete with standardized splits, metadata, and a public leaderboard.

TabArena: A Living Benchmark for Machine Learning on Tabular Data

cs.LG · 2025-06-20 · conditional · novelty 8.0

TabArena launches a dynamic, updatable benchmarking system for tabular ML that shows boosted trees remain competitive, deep learning matches them under larger budgets with ensembling, foundation models excel on small data, and cross-model ensembles advance SOTA while flagging validation overfitting.

Data Language Models: A New Foundation Model Class for Tabular Data

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

TabEmbed is the first generalist embedding model for tabular data that unifies classification and retrieval in one space via contrastive learning and outperforms text embedding models on the new TabBench benchmark.

Ternary Decision Trees with Locally-Adaptive Uncertainty Zones

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Ternary decision trees with locally-adaptive uncertainty zones estimated from CART statistics improve decided accuracy over standard trees by blending boundary predictions and flagging uncertain cases.

Shaping the Prior: How Synthetic Task Distributions Determine Tabular Foundation Model Quality

cs.LG · 2026-05-18 · unverdicted · novelty 6.0

O'Prior, a compositional synthetic prior with hierarchical SCMs, realism engines, stress modules, and curriculum protocols, improves tabular foundation model accuracy and robustness on real benchmarks when architecture and compute are held fixed.

Active Tabular Augmentation via Policy-Guided Diffusion Inpainting

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

TAP couples a learner-conditioned policy with diffusion inpainting to generate and selectively inject high-utility tabular augmentations, yielding up to 15.6 pp accuracy gains and 32% RMSE reduction on seven datasets under severe scarcity.

Prior-Aligned Data Cleaning for Tabular Foundation Models

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

L2C2 is a deep RL framework that learns to clean tabular data by aligning it to the synthetic prior of tabular foundation models, yielding higher accuracy on some benchmarks and cross-dataset policy transfer.

Two-stage Optimization for Machine Learning Workflow

cs.LG · 2019-07-01 · unverdicted · novelty 4.0

Two-stage optimization for ML workflows that prioritizes data pipeline search over hyperparameter tuning, with time-allocation policies and a specificity metric for pruning.

Mitigating Label Shift in Tabular In-Context Learning via Test-Time Posterior Adjustment

cs.LG · 2026-05-06

citing papers explorer

Showing 10 of 10 citing papers.

MacrOData: New Benchmarks of Thousands of Datasets for Tabular Outlier Detection · cs.LG · 2026-02-10 · accept · none · ref 5
TabArena: A Living Benchmark for Machine Learning on Tabular Data · cs.LG · 2025-06-20 · conditional · none · ref 29
Data Language Models: A New Foundation Model Class for Tabular Data · cs.AI · 2026-05-07 · unverdicted · none · ref 1
TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding · cs.CL · 2026-05-06 · unverdicted · none · ref 3
Ternary Decision Trees with Locally-Adaptive Uncertainty Zones · cs.LG · 2026-05-21 · unverdicted · none · ref 3
Shaping the Prior: How Synthetic Task Distributions Determine Tabular Foundation Model Quality · cs.LG · 2026-05-18 · unverdicted · none · ref 24
Active Tabular Augmentation via Policy-Guided Diffusion Inpainting · cs.LG · 2026-05-11 · unverdicted · none · ref 59
Prior-Aligned Data Cleaning for Tabular Foundation Models · cs.LG · 2026-04-28 · unverdicted · none · ref 6
Two-stage Optimization for Machine Learning Workflow · cs.LG · 2019-07-01 · unverdicted · none · ref 37
Mitigating Label Shift in Tabular In-Context Learning via Test-Time Posterior Adjustment · cs.LG · 2026-05-06 · unreviewed · ref 2
