7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1
polarities
background 1
citing papers explorer
- VISTA: Variance-Gated Inter-Sequence Test-Time Adaptation for Multi-Sequence MRI Segmentation
  VISTA is a source-free TTA framework for multi-sequence MRI segmentation that uses inter-sequence spectral/patch interventions and cross-view variance gating to handle modality-interaction shifts, reporting Dice gains of 1.89% and 2.82% on SSA and PED cohorts.
- TextTeacher: What Can Language Teach About Images?
  TextTeacher uses frozen text embeddings from captions as semantic anchors to guide vision model training, improving ImageNet accuracy by up to 2.7 p.p. and transfer performance by 1.0 p.p. on average.
- Inducing Spatial Locality in Vision Transformers through the Training Protocol
  CutMix augmentation during training induces spatial locality in early layers of Vision Transformers trained from scratch, as measured by reduced Mean Attention Distance.
- Weak-to-Strong Knowledge Distillation Accelerates Visual Learning
  Weak-to-strong knowledge distillation applied early and then turned off accelerates convergence to target performance in visual learning tasks by factors of 1.7-4.8x.
- Layer-Guided UAV Tracking: Enhancing Efficiency and Occlusion Robustness
  LGTrack achieves 258.7 FPS real-time UAV tracking with 82.8% precision on UAVDT by combining dynamic layer selection, Global-Grouped Coordinate Attention, and Similarity-Guided Layer Adaptation.
- Nonlinear Transformations Against Unlearnable Datasets
  Nonlinear transformations enable DNNs to achieve substantial test accuracy gains (0.34% to 249.59%) on unlearnable CIFAR10 datasets from twelve protection methods, outperforming a recent linear baseline.
- SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions
  NTGA is the first clean-label generalization attack under black-box settings but is vulnerable to adversarial training and image transformations, with newer attacks outperforming it.