Emerging properties in self-supervised vision transformers

Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin · 2021

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

CineMatte: Background Matting for Virtual Production and Beyond

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

CineMatte uses a cross-attention design on a Siamese DINOv3 ViT plus a pretrained upsampler to produce robust mattes for virtual production, backed by a new non-synthetic 4K VP dataset that supports camera motion.

Scene-Agnostic Object-Centric Representation Learning for 3D Gaussian Splatting

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

A scene-agnostic object codebook learned via unsupervised object-centric learning provides consistent identity-anchored representations for 3D Gaussians across multiple scenes.

DetRefiner: Model-Agnostic Detection Refinement with Feature Fusion Transformer

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

DetRefiner fuses global and local features with a Transformer to refine OVOD confidence scores, delivering up to +10.1 AP gains on novel categories across multiple datasets.

How to Embed Matters: Evaluation of EO Embedding Design Choices

cs.CV · 2026-03-11 · unverdicted · novelty 5.0

Transformer backbones with mean pooling and combined self-supervised embeddings yield robust, compact representations for EO tasks that are over 500x smaller than raw data.

citing papers explorer

Showing 4 of 4 citing papers.

CineMatte: Background Matting for Virtual Production and Beyond

cs.CV · 2026-05-18 · unverdicted · none · ref 2

Scene-Agnostic Object-Centric Representation Learning for 3D Gaussian Splatting

cs.CV · 2026-04-10 · unverdicted · none · ref 3

DetRefiner: Model-Agnostic Detection Refinement with Feature Fusion Transformer

cs.CV · 2026-05-11 · unverdicted · none · ref 3

How to Embed Matters: Evaluation of EO Embedding Design Choices

cs.CV · 2026-03-11 · unverdicted · none · ref 5
