Smaller self-supervised ViTs localize objects better via attention than larger ViTs, enabling A² to decouple localization from feature extraction for competitive performance on distribution-shifted benchmarks.
arXiv preprint arXiv:2206.11646
5 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 5
roles: baseline 1
polarities: baseline 1
representative citing papers
On-policy self-distillation with sampled demonstrations reduces rollout diversity by amplifying existing probability gaps in the base model, unlike ideal RL which preserves ratios among correct outputs.
Evaluates four distribution shifts in sensor-based HAR, finds that diversity shifts dominate, shows that 28 DG methods only marginally beat ERM, and releases open benchmarks.
TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.
An iterative bootstrapped self-filtering approach selects balanced clean and diverse subsets from noisy vision-language datasets to train improved CLIP models.
citing papers explorer
- Assessing Distribution Shift in Human Activity Recognition for Domain Generalization