iBOT achieves 82.3% linear probing accuracy and 87.8% fine-tuning accuracy on ImageNet-1K using masked image modeling with a jointly trained online tokenizer.
For DINO, we directly use the projection head for [CLS] token and generate a 65536-d probability distribution for each patch token
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2021 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
iBOT: Image BERT Pre-Training with Online Tokenizer
iBOT achieves 82.3% linear probing accuracy and 87.8% fine-tuning accuracy on ImageNet-1K using masked image modeling with a jointly trained online tokenizer.