Orion-MSP: Multi-scale sparse attention for tabular in-context learning

Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning · 2025 · arXiv 2511.02818

9 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background: 2

citation-polarity summary

background: 2

representative citing papers

Rethinking Weak Supervision in Anomaly Detection: A Comprehensive Benchmark

cs.LG · 2026-05-25 · accept · novelty 7.0

WSADBench unifies WSAD evaluation across three supervision types, runs 700K experiments on 36 algorithms and 4 modalities, and finds strong correlations between scenarios plus performance boundaries favoring general models except in extreme label scarcity.

MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.

TFM-Retouche: A Lightweight Input-Space Adapter for Tabular Foundation Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

TFM-Retouche is an architecture-agnostic input-space residual adapter that improves tabular foundation model accuracy on 51 datasets by learning input corrections through the frozen backbone, with an identity guard to fall back to the original model.

Shaping the Prior: How Synthetic Task Distributions Determine Tabular Foundation Model Quality

cs.LG · 2026-05-18 · unverdicted · novelty 6.0

O'Prior, a compositional synthetic prior with hierarchical SCMs, realism engines, stress modules, and curriculum protocols, improves tabular foundation model accuracy and robustness on real benchmarks when architecture and compute are held fixed.

Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees

cs.LG · 2026-05-18 · accept · novelty 6.0

Distilling TabICLv2 into XGBoost via stratified OOF labeling yields 0.882 macro-mean AUC (96.5% of teacher) at 1.9 ms CPU across 153 datasets, with significant gains over tuned CatBoost on low-dimensional data.

Distilling Tabular Foundation Models for Structured Health Data

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

Leakage-aware distillation transfers at least 90% of tabular foundation model AUC to lightweight students across 19 health datasets, with 26x CPU speedup and preserved calibration/fairness.

Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

Six modern tabular foundation models are near-redundant, limiting ensemble gains to +0.18% accuracy at high cost while some methods degrade calibration.

Data Presentation Over Architecture: Resampling Strategies for Credit Risk Prediction with Tabular Foundation Models

cs.LG · 2026-05-18 · unverdicted · novelty 4.0

Context construction strategies such as balanced sampling improve AUC-ROC by 3-4 points over uniform sampling in tabular foundation models for credit risk, exceeding differences between model families and matching classical baselines.

Noise Immunity in In-Context Tabular Learning: An Empirical Robustness Analysis of TabPFN's Attention Mechanisms

cs.LG · 2026-04-06 · unverdicted · novelty 4.0

TabPFN maintains high ROC-AUC and structured attention under controlled additions of irrelevant features, nonlinear correlations, and mislabeled targets in binary classification.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Rethinking Weak Supervision in Anomaly Detection: A Comprehensive Benchmark

cs.LG · 2026-05-25 · accept · none · ref 10

WSADBench unifies WSAD evaluation across three supervision types, runs 700K experiments on 36 algorithms and 4 modalities, and finds strong correlations between scenarios plus performance boundaries favoring general models except in extreme label scarcity.

Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees

cs.LG · 2026-05-18 · accept · none · ref 10

Distilling TabICLv2 into XGBoost via stratified OOF labeling yields 0.882 macro-mean AUC (96.5% of teacher) at 1.9 ms CPU across 153 datasets, with significant gains over tuned CatBoost on low-dimensional data.
