SiT: Self-supervised vision transformer
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 3
verdicts
UNVERDICTED 3
roles
method 2
polarities
use method 2
representative citing papers
PRPO is a paragraph-level policy optimization technique that grounds vision-language model reasoning in image content to raise deepfake detection accuracy and reasoning quality.
Pre-training on modality-matched data significantly improves downstream performance in medical imaging models while self-supervised learning benefits depend on context.
citing papers explorer
- iBOT: Image BERT Pre-Training with Online Tokenizer
  iBOT achieves 82.3% linear probing accuracy and 87.8% fine-tuning accuracy on ImageNet-1K using masked image modeling with a jointly trained online tokenizer.
- PRPO: Paragraph-level Policy Optimization for Vision-Language Deepfake Detection
  PRPO is a paragraph-level policy optimization technique that grounds vision-language model reasoning in image content to raise deepfake detection accuracy and reasoning quality.
- From pre-training to downstream performance: Does domain-specific pre-training make sense?
  Pre-training on modality-matched data significantly improves downstream performance in medical imaging models while self-supervised learning benefits depend on context.