A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
Olson, William La Cava, Patryk Orzechowski, Ryan J
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.LG 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
TabDistill distills feature interactions from tabular foundation models via post-hoc attribution and inserts them into GAMs, yielding consistent predictive gains.
A CBR system based on similarity of local explanations provides visualizations that fraud analysts at a Dutch bank found useful and easy to use for processing ML-generated fraud alerts.
citing papers explorer
-
STRABLE: Benchmarking Tabular Machine Learning with Strings
A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
-
Selecting Feature Interactions for Generalized Additive Models by Distilling Foundation Models
TabDistill distills feature interactions from tabular foundation models via post-hoc attribution and inserts them into GAMs, yielding consistent predictive gains.
-
Case-Based Reasoning for Assisting Domain Experts in Processing Fraud Alerts of Black-Box Machine Learning Models
A CBR system based on similarity of local explanations provides visualizations that fraud analysts at a Dutch bank found useful and easy to use for processing ML-generated fraud alerts.