Why do tree-based models still outperform deep learning on tabular data?

17 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 4

citation-polarity summary

roles: background 4

polarities: background 4

representative citing papers

Learning Dynamic Stability Landscapes in Synchronization Networks

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Introduces graph-to-image prediction of per-node dynamic stability landscapes in oscillator networks from topology, releases two 10k-graph datasets, and shows GNN-CNN models achieve good accuracy with cross-size generalization.

Data Language Models: A New Foundation Model Class for Tabular Data

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

Schema-1 is the first Data Language Model that natively understands raw tabular data and outperforms gradient-boosted ensembles, AutoML, and prior tabular foundation models on row-level prediction and imputation tasks.

AXIL: Exact Instance Attribution for Gradient Boosting

cs.LG · 2023-01-05 · conditional · novelty 7.0

AXIL computes exact fixed-structure instance attributions for squared-error GBMs via a matrix-free O(TN) backward operator, outperforming BoostIn/TREX/LeafInfluence on 20 regression datasets.

Prior-Aligned Data Cleaning for Tabular Foundation Models

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

L2C2 is a deep RL framework that learns to clean tabular data by aligning it to the synthetic prior of tabular foundation models, yielding higher accuracy on some benchmarks and cross-dataset policy transfer.

UniRec: Unified Multimodal Encoding for LLM-Based Recommendations

cs.IR · 2026-01-27 · unverdicted · novelty 6.0

UniRec unifies heterogeneous recommendation modalities via specialized encoders, triplet representations, and hierarchical modeling to outperform prior multimodal LLM recommenders by up to 15% on benchmarks.

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

cs.LG · 2025-02-08 · unverdicted · novelty 6.0

TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.

Gradient Boosted Risk Scores

cs.LG · 2026-05-04 · conditional · novelty 5.0

Gradient boosting produces risk scores with competitive accuracy but 60% fewer rules on classification tasks and 16% fewer on time-to-event tasks than regression-based methods like AutoScore.

Kitchen Sink Anomaly Detection

hep-ph · 2026-04-22 · unverdicted · novelty 5.0

A combined kitchen sink observable set of Energy Flow Polynomials and subjettiness variables outperforms standard baselines in sensitivity to a wide range of resonant signals, with new public benchmarks released and an attribute bagging variant reducing training cost.
