Smaller self-supervised ViTs localize objects better via attention than larger ViTs, enabling A² to decouple localization from feature extraction for competitive performance on distribution-shifted benchmarks.
Invariant Causal Mechanisms through Distribution Matching. arXiv e-prints, art
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
baseline 1
citation-polarity summary
verdicts
UNVERDICTED 2
roles
baseline 1
polarities
baseline 1
representative citing papers
TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.
citing papers explorer
- $A^2$: Smaller Self-Supervised ViTs Localize Better than Larger Ones