AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li · 2020 · stat.ML · arXiv 2003.06505

Mixed citation behavior. Most common role is baseline (33%).

37 Pith papers citing it

Baseline 33% of classified citations

abstract

We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file. Unlike existing AutoML frameworks that primarily focus on model/hyperparameter selection, AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers. Experiments reveal that our multi-layer combination of many models offers better use of allocated training time than seeking out the best. A second contribution is an extensive evaluation of public and commercial AutoML platforms including TPOT, H2O, AutoWEKA, auto-sklearn, AutoGluon, and Google AutoML Tables. Tests on a suite of 50 classification and regression tasks from Kaggle and the OpenML AutoML Benchmark reveal that AutoGluon is faster, more robust, and much more accurate. We find that AutoGluon often even outperforms the best-in-hindsight combination of all of its competitors. In two popular Kaggle competitions, AutoGluon beat 99% of the participating data scientists after merely 4h of training on the raw data.
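The abstract's central idea, combining several models through a stacking layer instead of searching for a single best model, can be illustrated with a toy sketch. This is not AutoGluon's implementation: the two base learners, the 3-fold split, the grid-searched blend weight, and every function name below are assumptions chosen to keep the example dependency-free.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Base model 1 (hypothetical): always predicts the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Base model 2 (hypothetical): one-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def out_of_fold(fit, xs, ys, k=3):
    """Predict each row with a model that never saw that row during fitting."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds

def fit_stack(xs, ys, fit_a, fit_b, k=3):
    """Stacking layer: pick the blend weight that minimizes out-of-fold error."""
    oof_a = out_of_fold(fit_a, xs, ys, k)
    oof_b = out_of_fold(fit_b, xs, ys, k)

    def sse(w):
        return sum((w * a + (1 - w) * b - y) ** 2
                   for a, b, y in zip(oof_a, oof_b, ys))

    # Coarse grid over the weight given to model A; ties keep the smallest weight.
    weight = min((i / 20 for i in range(21)), key=sse)
    # Refit both base models on all rows before serving predictions.
    model_a, model_b = fit_a(xs, ys), fit_b(xs, ys)
    return weight, lambda x: weight * model_a(x) + (1 - weight) * model_b(x)

xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 for x in xs]  # exactly linear toy data
weight, predict = fit_stack(xs, ys, fit_mean, fit_linear)
```

On this exactly linear data the stacker puts all of its weight on the linear model, so the blend reduces to that model. With noisier data and more varied base learners the weight would typically land between the extremes.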

citation-role summary

baseline 3 · method 3 · background 2 · dataset 1
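The "33% of classified citations" figure appears to follow from these counts: nine citations carry a role label, and three of them are baseline. A quick check, assuming the share is each count divided by the classified total and rounded to a whole percent (the variable names are illustrative):

```python
# Role counts as listed in the citation-role summary.
role_counts = {"baseline": 3, "method": 3, "background": 2, "dataset": 1}

# Only classified citations enter the denominator (9 here, not all 37 citing papers).
total_classified = sum(role_counts.values())

# Share of each role as a rounded whole percentage.
shares = {role: round(100 * n / total_classified) for role, n in role_counts.items()}
```

Under this reading, baseline and method tie at the same share.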

citation-polarity summary

baseline 3 · use method 3 · background 2 · use dataset 1

representative citing papers

FLOATBench: A Dataset and Benchmark for Floating Offshore Wind Turbine Tower Fatigue

cs.AI · 2026-05-25 · unverdicted · novelty 8.0

FLOATBench is a tabular benchmark dataset with 582,120 fatigue labels from 19,404 OpenFAST simulations of three 22 MW FOWT towers, featuring alpha-shape regime partitioning and three evaluation protocols for surrogate models.

TabArena: A Living Benchmark for Machine Learning on Tabular Data

cs.LG · 2025-06-20 · conditional · novelty 8.0

TabArena launches a dynamic, updatable benchmarking system for tabular ML that shows boosted trees remain competitive, deep learning matches them under larger budgets with ensembling, foundation models excel on small data, and cross-model ensembles advance SOTA while flagging validation overfitting.

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

cs.LG · 2022-07-05 · conditional · novelty 8.0

TabPFN is a Prior-Data Fitted Network that approximates Bayesian inference for small tabular classification by training a Transformer once on synthetic data drawn from a causal prior, then solves new tasks in a single forward pass without further updates.

Beyond IID: How General Are Tabular Foundation Models, Really?

cs.LG · 2026-06-29 · unverdicted · novelty 7.0

Tabular foundation models excel on tiny- to medium-sized IID data but are outperformed by traditional tree-based and deep learning models on non-IID, large, and high-dimensional datasets, based on evaluations across 11 models and 142 datasets in the new BeyondArena benchmark.

FlexTab: A Flexible Encoder-Decoder Architecture for In-Context Learning Across Diverse Tabular Tasks

cs.LG · 2026-06-29 · unverdicted · novelty 7.0 · 2 refs

FlexTab shows a shared encoder with task-specific decoders trained on unlabeled tables can achieve SOTA on classification, regression, anomaly detection and entity matching while staying competitive on relational entity classification.

TabPrep: Closing the Feature Engineering Gap in Tabular Benchmarks

cs.LG · 2026-06-01 · unverdicted · novelty 7.0

TabPrep is a new feature engineering pipeline that targets three data patterns and improves performance of tree-based, neural, linear, and foundation models on tabular benchmarks, often more than model architecture changes.

1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.

PromptDx: Differentiable Prompt Tuning for Multimodal In-Context Alzheimer's Diagnosis

cs.CV · 2026-05-09 · unverdicted · novelty 7.0

PromptDx adds a differentiable adapter to align multimodal data with a pre-trained TabPFN-style ICL engine, achieving strong Alzheimer's diagnosis performance with only 1% context samples.

Data Language Models: A New Foundation Model Class for Tabular Data

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.

RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy

cs.LG · 2026-05-03 · unverdicted · novelty 7.0

RamanBench unifies 74 datasets into the first large-scale reproducible benchmark for ML on Raman spectra, finding tabular foundation models outperform baselines but no method generalizes across datasets.

Probabilistic Spectral Reconstruction of Trans-Neptunian Objects from Sparse Photometry: A Framework for Taxonomy, Survey Optimization, and Outlier Detection

astro-ph.EP · 2026-04-26 · unverdicted · novelty 7.0 · 2 refs

Probabilistic PCA latent-space model with Bayesian inference reconstructs TNO near-IR spectra from photometry, achieving 95% credible-interval coverage and supporting taxonomy plus survey optimization.

KompeteAI: Accelerated Autonomous Multi-Agent System for End-to-End Pipeline Generation for Machine Learning Problems

cs.AI · 2025-08-13 · unverdicted · novelty 7.0

KompeteAI accelerates AutoML pipeline evaluation 6.9 times and beats prior systems by 3% on MLE-Bench through candidate merging, external RAG, and predictive early scoring.

How Can Machine Learning Accelerate CALPHAD Free Energy Modeling?

cond-mat.mtrl-sci · 2026-05-31 · unverdicted · novelty 6.0

Hybrid ML models learn Redlich-Kister coefficients from elemental descriptors to enable zero-shot extrapolation of CALPHAD interaction parameters for unseen elements in FCC alloys.

TabPFN-3: Technical Report

cs.LG · 2026-05-13 · unverdicted · novelty 6.0 · 2 refs

TabPFN-3 scales tabular foundation models to 1M rows with synthetic pretraining, test-time compute, and benchmark-leading performance on tabular, relational, and tabular-text tasks while being up to 20x faster than TabPFN-2.5.

CarCrashNet: A Large-Scale Dataset and Hierarchical Neural Solver for Data-Driven Structural Crash Simulation

cs.LG · 2026-05-08 · accept · novelty 6.0 · 2 refs

CarCrashNet supplies a large multi-modal crash simulation benchmark and CrashSolver neural model for data-driven full-vehicle crash prediction, validated against experiments and commercial solvers.

Prior-Aligned Data Cleaning for Tabular Foundation Models

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

L2C2 is a deep RL framework that learns to clean tabular data by aligning it to the synthetic prior of tabular foundation models, yielding higher accuracy on some benchmarks and cross-dataset policy transfer.

Tabular foundation models for in-context prediction of molecular properties

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

Tabular foundation models achieve high accuracy in molecular property prediction through in-context learning, with up to 100% win rates on MoleculeACE tasks when paired with CheMeleon embeddings.

AgentGA: Evolving Code Solutions in Agent-Seed Space

cs.AI · 2026-04-16 · unverdicted · novelty 6.0 · 2 refs

AgentGA optimizes agent seeds with genetic algorithms and parent-archive inheritance to improve autonomous code generation, beating a baseline on 15 of 16 Kaggle competitions.

TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

TREX automates the LLM training lifecycle via collaborative agents and tree-based exploration, delivering consistent performance gains across 10 real-world fine-tuning tasks in FT-Bench.

KumoRFM-2: Scaling Foundation Models for Relational Learning

cs.LG · 2026-04-14 · unverdicted · novelty 6.0

KumoRFM-2 pre-trains on synthetic and real relational data across row, column, foreign-key and cross-sample axes, injects task information early, and achieves up to 8% gains over supervised baselines on 41 benchmarks in few-shot and fine-tuned regimes while handling billion-scale datasets.

Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization

cs.LG · 2026-03-18 · unverdicted · novelty 6.0

Auto-unrolled PGD with AutoML tuning reaches 98.8% of 200-iteration solver spectral efficiency using only 5 layers and 100 samples.

FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data

cs.LG · 2026-03-17 · unverdicted · novelty 6.0

FEAT is a linear-complexity structured data foundation model using dual-axis encoding, AFBM state-space models, and Conv-GLA to achieve O(N) scaling and permutation invariance while outperforming prior SFMs on real-world benchmarks.

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

cs.LG · 2025-11-11 · unverdicted · novelty 6.0

TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast production deployment.

Temporal Posed and Spontaneous Gesture Recognition from Electromyography in the Rock-Paper-Scissors Game

cs.LG · 2026-06-28 · unverdicted · novelty 5.0

Forearm EMG signals precede visible RPS gestures by hundreds of milliseconds and enable 63.4% accuracy for posed gestures plus 65% peak accuracy for inferring gestures from opponent reactions.

citing papers explorer

Showing 37 of 37 citing papers.

FLOATBench: A Dataset and Benchmark for Floating Offshore Wind Turbine Tower Fatigue cs.AI · 2026-05-25 · unverdicted · none · ref 45 · internal anchor
FLOATBench is a tabular benchmark dataset with 582,120 fatigue labels from 19,404 OpenFAST simulations of three 22 MW FOWT towers, featuring alpha-shape regime partitioning and three evaluation protocols for surrogate models.
TabArena: A Living Benchmark for Machine Learning on Tabular Data cs.LG · 2025-06-20 · conditional · none · ref 19 · internal anchor
TabArena launches a dynamic, updatable benchmarking system for tabular ML that shows boosted trees remain competitive, deep learning matches them under larger budgets with ensembling, foundation models excel on small data, and cross-model ensembles advance SOTA while flagging validation overfitting.
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second cs.LG · 2022-07-05 · conditional · none · ref 6 · internal anchor
TabPFN is a Prior-Data Fitted Network that approximates Bayesian inference for small tabular classification by training a Transformer once on synthetic data drawn from a causal prior, then solves new tasks in a single forward pass without further updates.
Beyond IID: How General Are Tabular Foundation Models, Really? cs.LG · 2026-06-29 · unverdicted · none · ref 182 · internal anchor
Tabular foundation models excel on tiny- to medium-sized IID data but are outperformed by traditional tree-based and deep learning models on non-IID, large, and high-dimensional datasets, based on evaluations across 11 models and 142 datasets in the new BeyondArena benchmark.
FlexTab: A Flexible Encoder-Decoder Architecture for In-Context Learning Across Diverse Tabular Tasks cs.LG · 2026-06-29 · unverdicted · none · ref 16 · 2 links · internal anchor
FlexTab shows a shared encoder with task-specific decoders trained on unlabeled tables can achieve SOTA on classification, regression, anomaly detection and entity matching while staying competitive on relational entity classification.
TabPrep: Closing the Feature Engineering Gap in Tabular Benchmarks cs.LG · 2026-06-01 · unverdicted · none · ref 80 · internal anchor
TabPrep is a new feature engineering pipeline that targets three data patterns and improves performance of tree-based, neural, linear, and foundation models on tabular benchmarks, often more than model architecture changes.
1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job? cs.LG · 2026-05-16 · unverdicted · none · ref 14 · internal anchor
Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.
PromptDx: Differentiable Prompt Tuning for Multimodal In-Context Alzheimer's Diagnosis cs.CV · 2026-05-09 · unverdicted · none · ref 5 · internal anchor
PromptDx adds a differentiable adapter to align multimodal data with a pre-trained TabPFN-style ICL engine, achieving strong Alzheimer's diagnosis performance with only 1% context samples.
Data Language Models: A New Foundation Model Class for Tabular Data cs.AI · 2026-05-07 · unverdicted · none · ref 4 · internal anchor
Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.
RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy cs.LG · 2026-05-03 · unverdicted · none · ref 67 · internal anchor
RamanBench unifies 74 datasets into the first large-scale reproducible benchmark for ML on Raman spectra, finding tabular foundation models outperform baselines but no method generalizes across datasets.
Probabilistic Spectral Reconstruction of Trans-Neptunian Objects from Sparse Photometry: A Framework for Taxonomy, Survey Optimization, and Outlier Detection astro-ph.EP · 2026-04-26 · unverdicted · none · ref 21 · 2 links · internal anchor
Probabilistic PCA latent-space model with Bayesian inference reconstructs TNO near-IR spectra from photometry, achieving 95% credible-interval coverage and supporting taxonomy plus survey optimization.
KompeteAI: Accelerated Autonomous Multi-Agent System for End-to-End Pipeline Generation for Machine Learning Problems cs.AI · 2025-08-13 · unverdicted · none · ref 4 · internal anchor
KompeteAI accelerates AutoML pipeline evaluation 6.9 times and beats prior systems by 3% on MLE-Bench through candidate merging, external RAG, and predictive early scoring.
How Can Machine Learning Accelerate CALPHAD Free Energy Modeling? cond-mat.mtrl-sci · 2026-05-31 · unverdicted · none · ref 23 · internal anchor
Hybrid ML models learn Redlich-Kister coefficients from elemental descriptors to enable zero-shot extrapolation of CALPHAD interaction parameters for unseen elements in FCC alloys.
TabPFN-3: Technical Report cs.LG · 2026-05-13 · unverdicted · none · ref 2 · 2 links · internal anchor
TabPFN-3 scales tabular foundation models to 1M rows with synthetic pretraining, test-time compute, and benchmark-leading performance on tabular, relational, and tabular-text tasks while being up to 20x faster than TabPFN-2.5.
CarCrashNet: A Large-Scale Dataset and Hierarchical Neural Solver for Data-Driven Structural Crash Simulation cs.LG · 2026-05-08 · accept · none · ref 32 · 2 links · internal anchor
CarCrashNet supplies a large multi-modal crash simulation benchmark and CrashSolver neural model for data-driven full-vehicle crash prediction, validated against experiments and commercial solvers.
Prior-Aligned Data Cleaning for Tabular Foundation Models cs.LG · 2026-04-28 · unverdicted · none · ref 8 · internal anchor
L2C2 is a deep RL framework that learns to clean tabular data by aligning it to the synthetic prior of tabular foundation models, yielding higher accuracy on some benchmarks and cross-dataset policy transfer.
Tabular foundation models for in-context prediction of molecular properties cs.LG · 2026-04-17 · unverdicted · none · ref 38 · internal anchor
Tabular foundation models achieve high accuracy in molecular property prediction through in-context learning, with up to 100% win rates on MoleculeACE tasks when paired with CheMeleon embeddings.
AgentGA: Evolving Code Solutions in Agent-Seed Space cs.AI · 2026-04-16 · unverdicted · none · ref 2 · 2 links · internal anchor
AgentGA optimizes agent seeds with genetic algorithms and parent-archive inheritance to improve autonomous code generation, beating a baseline on 15 of 16 Kaggle competitions.
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration cs.AI · 2026-04-15 · unverdicted · none · ref 9 · internal anchor
TREX automates the LLM training lifecycle via collaborative agents and tree-based exploration, delivering consistent performance gains across 10 real-world fine-tuning tasks in FT-Bench.
KumoRFM-2: Scaling Foundation Models for Relational Learning cs.LG · 2026-04-14 · unverdicted · none · ref 4 · internal anchor
KumoRFM-2 pre-trains on synthetic and real relational data across row, column, foreign-key and cross-sample axes, injects task information early, and achieves up to 8% gains over supervised baselines on 41 benchmarks in few-shot and fine-tuned regimes while handling billion-scale datasets.
Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization cs.LG · 2026-03-18 · unverdicted · none · ref 11 · internal anchor
Auto-unrolled PGD with AutoML tuning reaches 98.8% of 200-iteration solver spectral efficiency using only 5 layers and 100 samples.
FEAT: A Linear-Complexity Foundation Model for Extremely Large Structured Data cs.LG · 2026-03-17 · unverdicted · none · ref 13 · internal anchor
FEAT is a linear-complexity structured data foundation model using dual-axis encoding, AFBM state-space models, and Conv-GLA to achieve O(N) scaling and permutation invariance while outperforming prior SFMs on real-world benchmarks.
TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models cs.LG · 2025-11-11 · unverdicted · none · ref 32 · internal anchor
TabPFN-2.5 scales tabular foundation models to 20x larger datasets, outperforms tuned tree models on TabArena, achieves near-perfect win rates against default XGBoost, and adds a distillation engine for fast production deployment.
Temporal Posed and Spontaneous Gesture Recognition from Electromyography in the Rock-Paper-Scissors Game cs.LG · 2026-06-28 · unverdicted · none · ref 21 · internal anchor
Forearm EMG signals precede visible RPS gestures by hundreds of milliseconds and enable 63.4% accuracy for posed gestures plus 65% peak accuracy for inferring gestures from opponent reactions.
Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap cs.LG · 2026-05-18 · unverdicted · none · ref 26 · internal anchor
Six modern tabular foundation models are near-redundant, limiting ensemble gains to +0.18% accuracy at high cost while some methods degrade calibration.
TabH2O: A Unified Foundation Model for Tabular Prediction cs.LG · 2026-05-18 · unverdicted · none · ref 5 · internal anchor
TabH2O presents a unified tabular foundation model with dual-head architecture and single-stage pretraining that achieves an average rank of 2.55 on the TALENT benchmark, outperforming several established methods.
Inferring stellar metallicity and elemental abundances from kinematic and spectroscopic data using machine learning -- Implications for exoplanet host stars astro-ph.EP · 2026-05-18 · unverdicted · none · ref 30 · internal anchor
ML regressors trained on APOGEE DR17 red giants predict C, O, Mg, Si abundances from kinematics and [Fe/H] more accurately than [Fe/H] baseline, with external validation on HARPS FGK dwarfs and reproduction of Galactic chemical evolution trends.
Mind the Gap? A Distributional Comparison of Real and Synthetic Priors for Tabular Foundation Models cs.AI · 2026-05-07 · unverdicted · none · ref 12 · internal anchor
The synthetic prior for tabular foundation models covers only a narrow part of real table distributions, but this mismatch does not degrade model generalization.
DPU or GPU for Accelerating Neural Networks Inference -- Why not both? Split CNN Inference cs.AR · 2026-04-30 · unverdicted · none · ref 22 · 2 links · internal anchor
Split CNN Inference partitions layers between DPU and GPU with a GNN predictor, reporting up to 3.37x latency reduction over single-device runs and 96.27% GNN accuracy on tested models.
Spatial Atlas: Compute-Grounded Reasoning for Spatial-Aware Research Agent Benchmarks cs.AI · 2026-04-13 · unverdicted · none · ref 5 · internal anchor
Spatial Atlas implements compute-grounded reasoning via a structured scene graph engine and deterministic computations to deliver competitive accuracy on spatial QA and Kaggle ML benchmarks while preserving interpretability.
TusoAI: Agentic Optimization for Scientific Methods cs.AI · 2025-09-28 · unverdicted · none · ref 9 · internal anchor
TusoAI is an LLM-based agent that builds and iteratively optimizes domain-specific computational methods for scientific data analysis, outperforming expert baselines on RNA-seq denoising and earth monitoring while reporting new genetic associations.
Retrieval-Augmented Generation with Graphs (GraphRAG) cs.IR · 2024-12-31 · unverdicted · none · ref 101 · internal anchor
A survey proposing a holistic GraphRAG framework with components including query processor, retriever, organizer, generator, and data source, plus domain-tailored reviews, challenges, and future directions.
Customer Churn Prediction on Structured Data Using FT-Transformer and Stacking Ensembles cs.LG · 2026-05-26 · unverdicted · none · ref 35 · internal anchor
A stacking ensemble of FT-Transformer and XGBoost achieves superior F1 and AUC scores on a bank churn dataset compared to an MLP baseline under cross-validation.
A Reproducible Log-Driven AutoML Framework for Interpretable Pipeline Optimization in Healthcare Risk Prediction cs.LG · 2026-05-19 · unverdicted · none · ref 8 · 2 links · internal anchor
yvsoucom-iterkit shows that performance on two healthcare datasets is dominated by a small subset of interacting pipeline components, allowing constrained search spaces to improve efficiency, stability, and interpretability.
Why Model Selection Fails in Time Series Forecasting: An Empirical Study of Instability Across Data Regimes eess.SP · 2026-05-02 · unverdicted · none · ref 26 · internal anchor
Rule-based model selection in time series forecasting achieves low accuracy and exhibits high ranking instability across data regimes and forecasting horizons.
XAI and Statistical Analysis for Reliable Intrusion Detection in the UAVIDS-2025 Dataset: From Tree to Hybrid and Tabular DNN Ensembles cs.CR · 2026-05-13 · unverdicted · none · ref 15 · internal anchor
XGBoost with SHAP and statistical distribution analysis on UAVIDS-2025 identifies density support intersection as the cause of false predictions for Wormhole and Blackhole attacks in UAV intrusion detection.
Explaining Tabular Foundation Model Differences Through Meta-Features cs.LG · 2026-05-27 · unreviewed · ref 1 · internal anchor
